(54) Title: HELICOBACTER PYLORI PROTEINS USEFUL FOR VACCINES AND DIAGNOSTICS
(57) Abstract 

Helicobacter pylori is known to cause or be a cofactor in type B
gastritis, peptic ulcers, and gastric tumors. In both developed and
developing countries, a high percentage of people are infected with
this bacterium. The present invention relates generally to certain
H. pylori proteins, to the genes which express these proteins, and to
the use of these proteins for diagnostic and vaccine applications.
Specifically, molecular cloning, nucleotide, and amino acid sequences
for the H. pylori cytotoxin (CT), the "Cytotoxin Associated
Immunodominant" (CAI) antigen, and the heat shock protein (hsp) are
described herein.

HELICOBACTER PYLORI PROTEINS USEFUL FOR VACCINES AND DIAGNOSTICS

BACKGROUND OF THE INVENTION
1. Field of the Invention

The present invention relates generally to certain
Helicobacter pylori proteins, to the genes which express
these proteins, and to the use of these proteins for
diagnostic and vaccine applications.

2. Brief Description of Related Art

Helicobacter pylori is a curved, microaerophilic,
gram negative bacterium that was isolated for the first
time in 1982 from stomach biopsies of patients with chronic
gastritis, Warren et al., Lancet i:1273-75 (1983).
Originally named Campylobacter pylori, it has been
recognized to be part of a separate genus named
Helicobacter, Goodwin et al., J. Syst. Bacteriol.
39:397-405 (1989). The bacterium colonizes the human
gastric mucosa, and infection can persist for decades.
During the last few years, the presence of the bacterium has
been associated with chronic gastritis type B, a condition
that may remain asymptomatic in most infected persons but
increases considerably the risk of peptic ulcer and gastric
adenocarcinoma. The most recent studies strongly suggest
that H. pylori infection may be either a cause or a cofactor
of type B gastritis, peptic ulcers, and gastric tumors, see
e.g., Blaser, Gastroenterology 93:371-83 (1987); Dooley et
al., New Engl. J. Med. 321:1562-66 (1989); Parsonnet et
al., New Engl. J. Med. 325:1127-31 (1991). H. pylori is
believed to be transmitted by the oral route, Thomas et al.,
Lancet 1:340, 1194 (1992), and the risk of infection
increases with age, Graham et al., Gastroenterology
100:1495-1501 (1991), and is facilitated by crowding, Drumm
et al., New Engl. J. Med. 322:359-63 (1990); Blaser, Clin.
Infect. Dis. 15:386-93 (1993). In developed countries, the
presence of antibodies against H. pylori antigens increases
from less than 20% to over 50% in people 30 and 60 years old,
respectively, Jones et al., Med. Microbio. 22:57-63 (1986);
Morris et al., N.Z. Med. J. 99:657-59 (1986), while in
developing countries over 80% of the population are already
infected by the age of 20, Graham et al., Digestive Diseases
and Sciences 36:1084-88 (1991).

The nature and the role of the virulence factors
of H. pylori are still poorly understood. The factors that
have been identified so far include the flagella that are
probably necessary to move across the mucus layer, see e.g.,
Leying et al., Mol. Microbiol. 6:3863-74 (1992); the urease
that is necessary to neutralize the acidic environment of
the stomach and to allow initial colonization, see e.g.,
Cussac et al., J. Bacteriol. 174:2466-73 (1992); Perez-Perez
et al., J. Infect. Immun. 60:3658-3663 (1992); Austin et
al., J. Bacteriol. 174:7470-73 (1992); PCT Publ. No. WO
90/04030; and a high molecular weight cytotoxic protein
formed by monomers allegedly having a molecular weight of 87
kDa that causes formation of vacuoles in eukaryotic
epithelial cells and is produced by H. pylori strains
associated with disease, see e.g., Cover et al., J. Bio.
Chem. 267:10570-75 (1992) (referencing a "vacuolating toxin"
with a specified 33 amino acid N-terminal sequence); Cover
et al., J. Clin. Invest. 90:913-18 (1992); Leunk, Rev.
Infect. Dis. 13:S686-89 (1991). Additionally, the following
is also known.

H. pylori culture supernatants have been shown by
different authors to contain an antigen with a molecular
weight of 120, 128, or 130 kDa, Apel et al., Zentralblatt für
Bakteriol. Microb. und Hygiene 268:271-76 (1988); Crabtree
et al., J. Clin. Pathol. 45:733-34 (1992); Cover et al.,
Infect. Immun. 58:603-10 (1990); Figura et al., H. pylori,
gastritis and peptic ulcer (eds. Malfertheiner et al.),
Springer Verlag, Berlin (1990). Whether the difference in
size of the antigen described was due to interlaboratory
differences in estimating the molecular weight of the same
protein, to the size variability of the same antigen, or to
actual different proteins was not clear. No nucleotide or
amino acid sequence information was given about the protein.
This protein is very immunogenic in infected humans because
specific antibodies are detected in sera of virtually all
patients infected with H. pylori, Gerstenecker et al., Eur.
J. Clin. Microbiol. 11:595-601 (1992).

H. pylori heat shock proteins (hsp) have been
described, Evans et al. [...] (44 amino acid N-terminal
sequence and molecular weight of about [...] kDa); Dunn et
al. [...] (molecular weight about 34 [...]); [...]
Gastroenterology [...] antigens (300-700 kDa) from the outer
membrane surface [...] urease activity); [...] No. 329 570
(antigenic compositions for detecting [...] antibodies
having [...] one [...] 43, and [...] kDa).

The percentage of people infected by H. pylori,
either in a symptomatic or an asymptomatic [...] vaccines
and further diagnostic tests for this disease.

SUMMARY OF THE INVENTION
The present invention describes nucleotide and
amino acid sequences for three major H. pylori proteins.
Specifically, these are the [...] recombinant [...]
and host cells. The understanding at the molecular level of
the nature and the role of these proteins and the
availability of recombinant production has important
implications for the development of new diagnostics for
H. pylori and for the design of vaccines that may prevent
H. pylori infection and treat disease.

As such, these proteins can be used in both
vaccine and diagnostic applications. The present invention
includes methods for treating and diagnosing those diseases
associated with H. pylori. As H. pylori has been associated
with type B gastritis, peptic ulcers, and gastric
adenocarcinoma, it is hoped that the present invention will
assist in early detection and alleviation of these disease
states. Currently, diagnosis relies mostly on endoscopy and
histological staining of biopsies; existing immunoassays are
based on H. pylori lysates or semi-purified antigens. Given
the heterogeneity found in such assays, correlation with
disease state is not yet well established. Thus, the
potential for recombinant antigen-based immunoassays, as
well as nucleic acid assays for disease detection, is great.
At present, there is no commercial vaccine for H. pylori
infection or treatment. A recombinant vaccine is thus an
object of the present invention.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 is the nucleotide sequence for the
cytotoxin (CT) protein.

Fig. 2 is the amino acid sequence for the
cytotoxin (CT) protein.

Fig. 3 is a map of the cai gene for the CAI
protein and summary of the clones used to identify and
sequence this gene.

Fig. 4 is the nucleotide and amino acid sequences
of the CAI antigen.

Fig. 5 is the nucleotide and amino acid sequences
of the heat shock protein (hsp).



DETAILED DESCRIPTION OF THE INVENTION
General Methodology
The practice of the present invention will employ,
unless otherwise indicated, conventional techniques of
molecular biology, microbiology, recombinant DNA, and
immunology, which are within the skill of the art. Such
techniques are explained fully in the literature. See e.g.,
Sambrook, et al., MOLECULAR CLONING; A LABORATORY MANUAL,
SECOND EDITION (1989); DNA CLONING, VOLUMES I AND II (D.N.
Glover ed. 1985); OLIGONUCLEOTIDE SYNTHESIS (M.J. Gait ed.
1984); NUCLEIC ACID HYBRIDIZATION (B.D. Hames & S.J. Higgins
eds. 1984); TRANSCRIPTION AND TRANSLATION (B.D. Hames & S.J.
Higgins eds. 1984); ANIMAL CELL CULTURE (R.I. Freshney ed.
1986); IMMOBILIZED CELLS AND ENZYMES (IRL Press, 1986); B.
Perbal, A PRACTICAL GUIDE TO MOLECULAR CLONING (1984); the
series, METHODS IN ENZYMOLOGY (Academic Press, Inc.); GENE
TRANSFER VECTORS FOR MAMMALIAN CELLS (J.H. Miller and M.P.
Calos eds. 1987, Cold Spring Harbor Laboratory), Methods in
Enzymology Vol. 154 and Vol. 155 (Wu and Grossman, and Wu,
eds., respectively), Mayer and Walker, eds. (1987),
IMMUNOCHEMICAL METHODS IN CELL AND MOLECULAR BIOLOGY
(Academic Press, London), Scopes, (1987), PROTEIN
PURIFICATION: PRINCIPLES AND PRACTICE, Second Edition
(Springer-Verlag, N.Y.), and HANDBOOK OF EXPERIMENTAL
IMMUNOLOGY, VOLUMES I-IV (D.M. Weir and C.C. Blackwell eds.
1986).

Standard abbreviations for nucleotides and amino
acids are used in this specification. All publications,
patents, and patent applications cited herein are
incorporated by reference.

"Cytotoxin" or "toxin" of H. pylori refers to the
protein, and fragments thereof, whose nucleotide sequence
and amino acid sequences are shown in Figs. 1 and 2,
respectively, and their derivatives, and whose molecular
weight is about 140 kDa. This protein serves as a precursor
to a protein having an approximate weight of 100 kDa and
having cytotoxic activity. The cytotoxin causes vacuolation
and death of a number of eukaryotic cell types and has been
purified from H. pylori culture supernatants. Additionally,
the cytotoxin is proteinaceous and has an apparent molecular
mass determined by gel filtration of approximately 950-972
kDa. Denaturing gel electrophoresis of purified material
previously revealed that the principal component of the 950-
972 kDa molecule was allegedly a polypeptide of apparent
molecular mass of 87 kDa, Cover et al., J. Biol. Chem.
267:10570-75 (1992). It is suggested herein, however, that
the previously described 87 kDa results from either the
further processing of the 100 kDa protein or from
proteolytic degradation of a larger protein during
purification.

The "Cytotoxin Associated Immunodominant" (CAI)
antigen refers to that protein, and fragments thereof, whose
amino acid sequence is described in Fig. 4 and derivatives
thereof. This is a hydrophilic, surface-exposed protein
having a molecular weight of approximately 120-132 kDa,
preferably 128-130 kDa, produced by clinical isolates. The
size of the gene and of the encoded protein varies in
different strains by a mechanism that involves duplication
of regions internal to the gene. The clinical isolates that
do not produce the CAI antigen do not have the cai gene,
and are also unable to produce an active cytotoxin. The
association between the presence of the cai gene and
cytotoxicity suggests that the product of the cai gene is
necessary for the transcription, folding, export or function
of the cytotoxin. Alternatively, both the cytotoxin (CT)
and the cai gene are absent in noncytotoxic strains. This
would imply some physical linkage between the two genes. A
peculiar property of the CAI antigen is the size
variability, suggesting that the cai gene is continuously
changing. The CAI antigen appears to be associated to the
cell surface. This suggests that the release of the antigen
in the supernatant may be due to the action of proteases
present in the serum that may cleave either the antigen
itself, or the complexes that hold the CAI antigen
associated to the bacterial surface. Similar processing
activities may release the antigen during in vivo growth.
The absence of a typical leader peptide sequence suggests
the presence of an independent export system.

"Heat shock protein" (hsp) refers to the H. pylori
protein, and fragments thereof, whose amino acid sequence is
given in Fig. 5 and derivatives thereof, and whose molecular
weight is in the range of 54-62 kDa, preferably about 58-60
kDa. This hsp belongs to the group of Gram negative
bacteria heat shock proteins, hsp60. In general, hsp are
among the most conserved proteins in all living organisms,
either prokaryotic and eukaryotic, animals and plants, and
the conservation is spread along the whole sequence. This
high conservation suggests a participation of the whole
sequence at the functional structure of the protein that can
be hardly modified without impairing its activity.

Examples of proteins that can be used in the
present invention include polypeptides with minor amino acid
variations from the natural amino acid sequence of the
protein; in particular, conservative amino acid replacements
are contemplated. Conservative replacements are those that
take place within a family of amino acids that are related
in their side chains. Genetically encoded amino acids are
generally divided into four families: (1) acidic =
aspartate, glutamate; (2) basic = lysine, arginine,
histidine; (3) non-polar = alanine, valine, leucine,
isoleucine, proline, phenylalanine, methionine, tryptophan;
and (4) uncharged polar = glycine, asparagine, glutamine,
cystine, serine, threonine, tyrosine. Phenylalanine,
tryptophan, and tyrosine are sometimes classified jointly as
aromatic amino acids. For example, it is reasonably
predictable that an isolated replacement of a leucine with
an isoleucine or valine, an aspartate with a glutamate, a
threonine with a serine, or a similar conservative
replacement of an amino acid with a structurally related
amino acid will not have a major effect on the biological
activity. Polypeptide molecules having substantially the
same amino acid sequence as the protein but possessing minor
amino acid substitutions that do not substantially affect
the functional aspects are within the definition of the
protein.
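
By way of illustration only, the four side-chain families listed above can be expressed as a small lookup. The sketch below is not part of the disclosed method; the names FAMILIES, family_of, and is_conservative are hypothetical, and the standard one-letter codes are assumed, with C standing for the cysteine/cystine entry of the uncharged polar family.

```python
# Side-chain families as listed in the text, written with the standard
# one-letter amino acid codes (an assumption of this sketch).
FAMILIES = {
    "acidic": set("DE"),               # aspartate, glutamate
    "basic": set("KRH"),               # lysine, arginine, histidine
    "non-polar": set("AVLIPFMW"),      # alanine ... tryptophan
    "uncharged polar": set("GNQCSTY"), # glycine ... tyrosine
}


def family_of(residue):
    """Return the name of the family that contains the residue."""
    for name, members in FAMILIES.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def is_conservative(original, replacement):
    """True when both residues belong to the same side-chain family."""
    return family_of(original) == family_of(replacement)


if __name__ == "__main__":
    # The replacements named in the text: Leu->Ile, Asp->Glu, Thr->Ser.
    for pair in (("L", "I"), ("D", "E"), ("T", "S"), ("L", "D")):
        print(pair, is_conservative(*pair))
```

Such a check only classifies a single replacement by family; it does not predict biological activity, which the text treats as reasonably predictable rather than guaranteed.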



A significant advantage of producing the protein
by recombinant DNA techniques rather than by isolating and
purifying a protein from natural sources is that equivalent
quantities of the protein can be produced by using less
starting material than would be required for isolating the
protein from a natural source. Producing the protein by
recombinant techniques also permits the protein to be
isolated in the absence of some molecules normally present
in cells. Indeed, protein compositions entirely free of any
trace of human protein contaminants can readily be produced
because the only human protein produced by the recombinant
non-human host is the recombinant protein at issue.
Potential viral agents from natural sources and viral
components pathogenic to humans are also avoided.

The term "recombinant polynucleotide" as used
herein intends a polynucleotide of genomic, cDNA,
semisynthetic, or synthetic origin which, by virtue of its
origin or manipulation: (1) is not associated with all or a
portion of a polynucleotide with which it is associated in
nature, (2) is linked to a polynucleotide other than that to
which it is linked in nature, or (3) does not occur in
nature. Thus, this term also encompasses the situation
wherein the H. pylori bacterium genome is genetically
modified (e.g., through mutagenesis) to produce one or more
altered polypeptides.

The term "polynucleotide" as used herein refers to
a polymeric form of a nucleotide of any length, preferably
deoxyribonucleotides, and is used interchangeably herein
with the terms "oligonucleotide" and "oligomer." The term
refers only to the primary structure of the molecule. Thus,
this term includes double- and single-stranded DNA, as well
as antisense polynucleotides. It also includes known types
of modifications, for example, the presence of labels which
are known in the art, methylation, end "caps," substitution
of one or more of the naturally occurring nucleotides with
an analog, internucleotide modifications such as, for
example, replacement with certain types of uncharged
linkages (e.g., methyl phosphonates, phosphotriesters,
phosphoamidates, carbamates, etc.) or charged linkages
(e.g., phosphorothioates, phosphorodithioates, etc.),
introduction of pendant moieties, such as, for example,



WO 93/18150

PCT/EP93/00472



proteins (including nucleases, toxins, antibodies, signal
peptides, poly-L-lysine, etc.), intercalators (e.g.,
acridine, psoralen, etc.), chelators (e.g., metals,
radioactive species, boron, oxidative moieties, etc.),
alkylators (e.g., alpha anomeric nucleic acids, etc.).

By "genomic" is meant a collection or library of
DNA molecules which are derived from restriction fragments
that have been cloned in vectors. This may include all or
part of the genetic material of an organism.

By "cDNA" is meant a complementary mRNA sequence
that hybridizes to a complementary strand of mRNA.

As used herein, the term "oligomer" refers to both
primers and probes and is used interchangeably herein with
the term "polynucleotide." The term oligomer does not
connote the size of the molecule. However, typically
oligomers are no greater than 1000 nucleotides, more typically
are no greater than 500 nucleotides, even more typically are
no greater than 250 nucleotides; they may be no greater than
100 nucleotides, and may be no greater than 75 nucleotides,
and also may be no greater than 50 nucleotides in length.

The term "primer" as used herein refers to an
oligomer which is capable of acting as a point of initiation
of synthesis of a polynucleotide strand when used under
appropriate conditions. The primer will be completely or
substantially complementary to a region of the
polynucleotide strand to be copied. Thus, under conditions
conducive to hybridization, the primer will anneal to the
complementary region of the analyte strand. Upon addition
of suitable reactants (e.g., a polymerase, nucleotide
triphosphates, and the like), the primer will be extended by
the polymerizing agent to form a copy of the analyte strand.
The primer may be single-stranded or alternatively may be
partially or fully double-stranded.
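
As a minimal sketch of what "completely complementary" means for a primer, the following locates the region of a strand to which a primer would anneal by exact base pairing. It assumes DNA written 5' to 3' with the letters A, C, G, T only; the function names and the example sequences are hypothetical and are not taken from the figures. Substantially (rather than completely) complementary primers are not handled.

```python
# Watson-Crick pairing table for DNA (A-T, C-G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def annealing_site(primer, analyte_strand):
    """Return the 0-based start of the region of analyte_strand that is
    exactly complementary to primer, or -1 when there is none."""
    return analyte_strand.upper().find(reverse_complement(primer))


if __name__ == "__main__":
    strand = "ATGCGTACCGGATTAG"   # hypothetical analyte strand
    primer = "CTAATCCGG"          # hypothetical primer
    print(annealing_site(primer, strand))
```

A primer found this way pairs antiparallel to the reported region, so extension by a polymerase would proceed toward the 5' end of the analyte strand.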

The terms "analyte polynucleotide" and "analyte
strand" refer to a single- or double-stranded nucleic acid
molecule which is suspected of containing a target sequence
and which may be present in a biological sample.

As used herein, the term "probe" refers to a
structure comprised of a polynucleotide which forms a hybrid
structure with a target sequence, due to complementarity of
at least one sequence in the probe with a sequence in the
target region. The polynucleotide regions of probes may be
composed of DNA, and/or RNA, and/or synthetic nucleotide
analogs. Included within probes are "capture probes" and
"label probes".

As used herein, the term "target region" refers to
a region of the nucleic acid which is to be amplified and/or
detected. The term "target sequence" refers to a sequence
with which a probe or primer will form a stable hybrid under
desired conditions.

The term "capture probe" as used herein refers to
a polynucleotide probe comprised of a single-stranded
polynucleotide coupled to a binding partner. The
single-stranded polynucleotide is comprised of a targeting
polynucleotide sequence, which is complementary to a target
sequence in a target region to be detected in the analyte
polynucleotide. This complementary region is of sufficient
length and complementarity to the target sequence to afford
a duplex of stability which is sufficient to immobilize the
analyte polynucleotide to a solid surface (via the binding
partners). The binding partner is specific for a second
binding partner; the second binding partner can be bound to
the surface of a solid support, or may be linked indirectly
via other structures or binding partners to a solid support.

The term "targeting polynucleotide sequence" as
used herein refers to a polynucleotide sequence which is
comprised of nucleotides which are complementary to a target
nucleotide sequence; the sequence is of sufficient length
and complementarity with the target sequence to form a
duplex which has sufficient stability for the purpose
intended.

The term "binding partner" as used herein refers
to a molecule capable of binding a ligand molecule with high
specificity, as for example an antigen and an antibody
specific therefor. In general, the specific binding
partners must bind with sufficient affinity to immobilize the
analyte copy/complementary strand duplex (in the case of
capture probes) under the isolation conditions. Specific
binding partners are known in the art, and include, for
example, biotin and avidin or streptavidin, IgG and protein
A, the numerous known receptor-ligand couples, and
complementary polynucleotide strands. In the case of
complementary polynucleotide binding partners, the partners
are normally at least about 15 bases in length, and may be
at least 40 bases in length; in addition, they have a
content of Gs and Cs of at least about 40% and as much as
about 60%. The polynucleotides may be composed of DNA, RNA,
or synthetic nucleotide analogs.
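
The length and G/C content figures given for complementary polynucleotide binding partners lend themselves to a simple screen. The sketch below is illustrative only: the function names and the example sequences are hypothetical, the "about" in the text is treated as a hard cutoff for simplicity, and only the letters A, C, G, T are assumed.

```python
def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def suits_as_binding_partner(seq, min_len=15, gc_min=0.40, gc_max=0.60):
    """Apply the stated criteria: at least about 15 bases long, with a
    G+C content between about 40% and about 60%."""
    return len(seq) >= min_len and gc_min <= gc_fraction(seq) <= gc_max


if __name__ == "__main__":
    for candidate in ("ATGCATGCATGCATGC", "ATATATATATATATAT", "GCGCATAT"):
        print(candidate, suits_as_binding_partner(candidate))
```

The thresholds are exposed as parameters so that the same screen can be rerun with other limits, for instance a minimum of 40 bases.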

The term "coupled" as used herein refers to
attachment by covalent bonds or by strong non-covalent
interactions (e.g., hydrophobic interactions, hydrogen
bonds, etc.). Covalent bonds may be, for example, ester,
ether, phosphoester, amide, peptide, imide, carbon-sulfur
bonds, carbon-phosphorus bonds, and the like.

The term "support" refers to any solid or
semi-solid surface to which a desired binding partner may be
anchored. Suitable supports include glass, plastic, metal,
polymer gels, and the like, and may take the form of beads,
wells, dipsticks, membranes, and the like.

The term "label" as used herein refers to any atom
or moiety which can be used to provide a detectable
(preferably quantifiable) signal, and which can be attached
to a polynucleotide or polypeptide.

As used herein, the term "label probe" refers to
a polynucleotide probe which is comprised of a targeting
polynucleotide sequence which is complementary to a target
sequence to be detected in the analyte polynucleotide. This
complementary region is of sufficient length and
complementarity to the target sequence to afford a duplex
comprised of the "label probe" and the "target sequence" to
be detected by the label. The label probe is coupled to a
label either directly, or indirectly via a set of ligand
molecules with high specificity for each other, including
multimers.

The term "multimer," as used herein, refers to
linear or branched polymers of the same repeating
single-stranded polynucleotide unit or different
single-stranded polynucleotide units. At least one of the
units has a sequence, length, and composition that permits
it to hybridize specifically to a first single-stranded
nucleotide sequence of interest, typically an analyte or a
polynucleotide probe (e.g., a label probe) bound to an
analyte. In order to achieve such specificity and
stability, this unit will normally be at least about 15
nucleotides in length, typically no more than about 50
nucleotides in length, and preferably about 30 nucleotides
in length; moreover, the content of Gs and Cs will normally
be at least about 40%, and at most about 60%. In addition
to such unit(s), the multimer includes a multiplicity of
units that are capable of hybridizing specifically and
stably to a second single-stranded nucleotide of interest,
typically a labeled polynucleotide or another multimer.
These units are generally about the same size and
composition as the multimers discussed above. When a
multimer is designed to be hybridized to another multimer,
the first and second oligonucleotide units are heterogeneous
(different), and do not hybridize with each other under the
conditions of the selected assay. Thus, multimers may be
label probes, or may be ligands which couple the label to
the probe.

A "replicon" is any genetic element, e.g., a
plasmid, a chromosome, a virus, a cosmid, etc. that behaves
as an autonomous unit of polynucleotide replication within
a cell; i.e., capable of replication under its own control.
This may include selectable markers.

"PCR" refers to the technique of polymerase chain
reaction as described in Saiki, et al., Nature 324:163
(1986); and Scharf et al., Science (1986) 233:1076-1078; and
U.S. 4,683,195; and U.S. 4,683,202.

As used herein, x is "heterologous" with respect
to y if x is not naturally associated with y in the
identical manner; i.e., x is not associated with y in nature
or x is not associated with y in the same manner as is found
in nature.

"Homology" refers to the degree of similarity
between x and y. The correspondence between the sequence
from one form to another can be determined by techniques
known in the art. For example, they can be determined by a
direct comparison of the sequence information of the
polynucleotide. Alternatively, homology can be determined
by hybridization of the polynucleotides under conditions
which form stable duplexes between homologous regions (for
example, those which would be used prior to digestion),
followed by digestion with single-stranded specific
nuclease(s), followed by size determination of the digested
fragments.
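
One simple form of the direct comparison of sequence information mentioned above is a position-by-position identity count. The sketch below is an illustration under stated assumptions, not the method of the specification: it assumes the two sequences have already been aligned to equal length, and the name percent_identity is hypothetical.

```python
def percent_identity(x, y):
    """Percentage of positions at which two pre-aligned, equal-length
    sequences carry the same residue (case-insensitive)."""
    if len(x) != len(y):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not x:
        raise ValueError("empty sequences")
    matches = sum(a == b for a, b in zip(x.upper(), y.upper()))
    return 100.0 * matches / len(x)


if __name__ == "__main__":
    print(percent_identity("ATGC", "ATGA"))
```

Sequences of unequal length would first need an alignment step, which is outside the scope of this sketch.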

A "vector" is a replicon in which another
polynucleotide segment is attached, so as to bring about the
replication and/or expression of the attached segment.

"Control sequence" refers to polynucleotide
sequences which are necessary to effect the expression of
coding sequences to which they are ligated. The nature of
such control sequences differs depending upon the host
organism; in prokaryotes, such control sequences generally
include promoter, ribosomal binding site, and transcription
termination sequence; in eukaryotes, generally, such control
sequences include promoters and transcription termination
sequence. The term "control sequences" is intended to
include, at a minimum, all components whose presence is
necessary for expression, and may also include additional
components whose presence is advantageous, for example,
leader sequences and fusion partner sequences.

"Operably linked" refers to a juxtaposition
wherein the components so described are in a relationship
permitting them to function in their intended manner. A
control sequence "operably linked" to a coding sequence is
ligated in such a way that expression of the coding sequence
is achieved under conditions compatible with the control
sequences.

An "open reading frame" (ORF) is a region of a
polynucleotide sequence which encodes a polypeptide; this
region may represent a portion of a coding sequence or a
total coding sequence.

A "coding sequence" is a polynucleotide sequence
which is translated into a polypeptide, usually via mRNA,
when placed under the control of appropriate regulatory
sequences. The boundaries of the coding sequence are
determined by a translation start codon at the 5'-terminus
and a translation stop codon at the 3'-terminus. A coding
sequence can include, but is not limited to, cDNA, and
recombinant polynucleotide sequences.
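
The boundary rule for a coding sequence (a start codon toward the 5' end and a stop codon toward the 3' end, in the same reading frame) can be sketched as a scan over the three forward frames. This is an illustrative simplification: it assumes ATG as the only start codon and TAA, TAG, TGA as stop codons, ignores the complementary strand, and uses the hypothetical name find_orfs and a made-up example sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=1):
    """Return sorted (start, end) pairs for forward-strand ORFs.

    start is the 0-based index of the ATG; end is exclusive and includes
    the stop codon. min_codons counts the codons before the stop codon,
    the ATG included."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first start codon in frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                   # look for the next ORF
    return sorted(orfs)


if __name__ == "__main__":
    example = "CCATGAAATAGGG"   # hypothetical sequence
    for start, end in find_orfs(example):
        print(start, end, example[start:end])
```

Raising min_codons filters out very short frames, which is usually necessary on real sequence data.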

As used herein, the term "polypeptide" refers to
a polymer of amino acids and does not refer to a specific
length of the product; thus, peptides, oligopeptides, and
proteins are included within the definition of polypeptide.
This term also does not refer to or exclude post expression
modifications of the polypeptide, for example,
glycosylations, acetylations, phosphorylations and the like.
Included within the definition are, for example,
polypeptides containing one or more analogs of an amino acid
(including, for example, unnatural amino acids, etc.),
polypeptides with substituted linkages, as well as other
modifications known in the art, both naturally occurring and
non-naturally occurring.

A polypeptide or amino acid sequence "derived
from" a designated nucleic acid sequence refers to a
polypeptide having an amino acid sequence identical to that
of a polypeptide encoded in the sequence, or a portion
thereof wherein the portion consists of at least 3-5 amino
acids, and more preferably at least 8-10 amino acids, and
even more preferably at least 11-15 amino acids, or which is
immunologically identifiable with a polypeptide encoded in
the sequence. This terminology also includes a polypeptide
expressed from a designated nucleic acid sequence.

"Immunogenic" refers to the ability of a
polypeptide to cause a humoral and/or cellular immune
response, whether alone or when linked to a carrier, in the
presence or absence of an adjuvant. "Neutralization" refers
to an immune response that blocks the infectivity, either
partially or fully, of an infectious agent.

"Epitope" refers to an antigenic determinant of a
peptide, polypeptide, or protein; an epitope can comprise 3
or more amino acids in a spatial conformation unique to the
epitope. Generally, an epitope consists of at least 5 such
amino acids and, more usually, consists of at least 8-10
such amino acids. Methods of determining spatial
conformation of amino acids are known in the art and
include, for example, x-ray crystallography and
2-dimensional nuclear magnetic resonance. Antibodies that
recognize the same epitope can be identified in a simple
immunoassay showing the ability of one antibody to block the
binding of another antibody to a target antigen.

"Treatment," as used herein, refers to prophylaxis
and/or therapy (i.e., the modulation of any disease
symptoms). An "individual" indicates an animal that is
susceptible to infection by H. pylori and includes, but is
not limited to, primates, including humans. A "vaccine" is
an immunogenic, or otherwise capable of eliciting protection
against H. pylori, whether partial or complete, composition
useful for treatment of an individual.

The H. pylori proteins may be used for producing
antibodies, either monoclonal or polyclonal, specific to the
proteins. The methods for producing these antibodies are
known in the art.

"Recombinant host cells", "host cells," "cells,"
"cell cultures," and other such terms denote, for example,
microorganisms, insect cells, and mammalian cells, that can
be, or have been, used as recipients for recombinant vector
or other transfer DNA, and include the progeny of the
original cell which has been transformed. It is understood
that the progeny of a single parental cell may not
necessarily be completely identical in morphology or in
genomic or total DNA complement as the original parent, due
to natural, accidental, or deliberate mutation. Examples
for mammalian host cells include Chinese hamster ovary (CHO)
and monkey kidney (COS) cells.

Specifically, as used herein, "cell line" refers
to a population of cells capable of continuous or prolonged
growth and division in vitro. Often, cell lines are clonal
populations derived from a single progenitor cell. It is
further known in the art that spontaneous or induced changes
can occur in karyotype during storage or transfer of such
clonal populations. Therefore, cells derived from the cell
line referred to may not be precisely identical to the
ancestral cells or cultures, and the cell line referred to
includes such variants. The term "cell lines" also includes
immortalized cells. Preferably, cell lines include
nonhybrid cell lines or hybridomas to only two cell types.

As used herein, the term "microorganism" includes
prokaryotic and eukaryotic microbial species such as
bacteria and fungi, the latter including yeast and
filamentous fungi.

"Transformation," as used herein, refers to the
insertion of an exogenous polynucleotide into a host cell,
irrespective of the method used for the insertion, for
example, direct uptake, transduction, f-mating or
electroporation. The exogenous polynucleotide may be
maintained as a non-integrated vector, for example, a
plasmid, or alternatively, may be integrated into the host
genome.

By "purified" and "isolated" is meant, when
referring to a polypeptide or nucleotide sequence, that the
indicated molecule is present in the substantial absence of
other biological macromolecules of the same type. The term
"purified" as used herein preferably means at least 75% by
weight, more preferably at least 85% by weight, more
preferably still at least 95% by weight, and most preferably
at least 98% by weight, of biological macromolecules of the
same type present (but water, buffers, and other small
molecules, especially molecules having a molecular weight of
less than 1000, can be present).
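
By way of illustration only, the graded definition of "purified"
can be written as a small check. The Python sketch below is not
part of the original disclosure: the function name, its arguments,
and the tier labels are hypothetical, and the 85% tier is an assumed
reading of a figure that is garbled in the scanned text.

```python
def purity_tier(target_mass, total_macromolecule_mass):
    """Return the strictest purity tier met, or None if below 75%.

    Both masses count only biological macromolecules of the same
    type; water, buffers, and small molecules are left out, as the
    definition allows them to be present.
    """
    if total_macromolecule_mass <= 0:
        raise ValueError("total macromolecule mass must be positive")
    percent = 100.0 * target_mass / total_macromolecule_mass
    # Tiers listed from strictest to loosest (percent by weight).
    tiers = [
        (98.0, "most preferably"),
        (95.0, "more preferably still"),
        (85.0, "more preferably"),  # assumed reading of a garbled figure
        (75.0, "preferably"),
    ]
    for threshold, label in tiers:
        if percent >= threshold:
            return label
    return None
```

For example, a preparation in which 96 of every 100 mass units of
protein is the indicated polypeptide would fall in the "more
preferably still" tier under these assumptions.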

C. Nucleic Acid Assays

Using as a basis the genome of H. pylori, poly-
nucleotide probes of approximately 8 nucleotides or more can
be prepared which hybridize with the positive strand(s) of
the RNA or its complement, as well as to cDNAs. These
polynucleotides serve as probes for the detection, isolation
and/or labeling of polynucleotides which contain nucleotide
sequences, and/or as primers for the transcription and/or
replication of the targeted sequences. Each probe contains
a targeting polynucleotide sequence, which is comprised of
nucleotides which are complementary to a target nucleotide
sequence; the sequence is of sufficient length and
complementarity with the sequence to form a duplex which has
sufficient stability for the purpose intended. For example,
if the purpose is the isolation, via immobilization, of an
analyte containing a target sequence, the probes will
contain a polynucleotide region which is of sufficient
length and complementarity to the targeted sequence to
afford sufficient duplex stability to immobilize the analyte
on a solid surface under the isolation conditions. For
example, also, if the polynucleotide probes are to serve as
primers for the transcription and/or replication of target
sequences, the probes will contain a polynucleotide region
of sufficient length and complementarity to the targeted
sequence to allow for replication. For example, also, if
the polynucleotide probes are to be used as label probes, or
are to bind to multimers, the targeting polynucleotide
region would be of sufficient length and complementarity to
form stable hybrid duplex structures with the label probes
and/or multimers to allow detection of the duplex. The
probes may contain a minimum of about 4 contiguous
nucleotides which are complementary to the targeted
sequence; usually the oligomers will contain a minimum of
about 6 continuous nucleotides which are complementary to
the targeted sequence, and preferably will contain a minimum
of about 14 contiguous nucleotides which are complementary
to the targeted sequence.
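
As an illustration of the length criteria above, the longest
stretch of contiguous complementarity between a probe and a target
can be computed directly. The following Python sketch is offered
only as an example under stated assumptions: it considers DNA
letters (A, C, G, T), strict Watson-Crick pairing, and antiparallel
strands, and it ignores hybridization conditions entirely. All
function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_complementary_run(probe, target):
    """Length of the longest contiguous probe stretch that can pair
    with a contiguous stretch of the target (antiparallel)."""
    rc = reverse_complement(probe)
    target = target.upper()
    best = 0
    # Longest-common-substring dynamic programme, one row at a time.
    prev = [0] * (len(target) + 1)
    for i in range(1, len(rc) + 1):
        cur = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            if rc[i - 1] == target[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def meets_minimum(probe, target, minimum=14):
    """True if the probe has at least `minimum` contiguous
    complementary nucleotides (14 is the preferred minimum above)."""
    return longest_complementary_run(probe, target) >= minimum
```

A probe built as the reverse complement of a 14-nucleotide stretch
of the target satisfies the preferred minimum; the same function
can be called with a smaller `minimum` to test the looser criteria.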

The probes, however, need not consist only of the
sequence which is complementary to the targeted sequence.
They may contain additional nucleotide sequences or other
moieties. For example, if the probes are to be used as
primers for the amplification of sequences via PCR, they may
contain sequences which, when in duplex, form restriction
enzyme sites which facilitate the cloning of the amplified
sequences. For example, also, if the probes are to be used
as "capture probes" in hybridization assays, they will be
coupled to a "binding partner" as defined above.
Preparation of the probes is by means known in the art,
including, for example, by methods which include excision,
transcription or chemical synthesis.
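
The use of additional primer sequences that form restriction
enzyme sites can be sketched as follows. This Python example is an
illustration only and is not taken from the original disclosure.
The BamHI and EcoRI recognition sequences are standard; the function
name, the two-base 5' "clamp," and the check against an internal
site in the annealing region are assumptions made for the example.

```python
# Recognition sequences for two common restriction enzymes.
SITES = {"BamHI": "GGATCC", "EcoRI": "GAATTC"}


def tailed_primer(annealing_region, enzyme, clamp="CG"):
    """Build a primer: 5' clamp + restriction site + annealing region.

    Only the annealing region is complementary to the target; the
    site becomes double-stranded (and cleavable) after amplification.
    """
    annealing = annealing_region.upper()
    site = SITES[enzyme]
    if site in annealing:
        # A second copy of the site here would be cut during cloning.
        raise ValueError("annealing region already contains the site")
    return clamp + site + annealing
```

The clamp length is a design choice of this sketch; a suitable
number of extra 5' bases depends on the enzyme used.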
D. Expression Systems

Once the appropriate H. pylori coding sequence is
isolated, it can be expressed in a variety of different
expression systems; for example those used with mammalian
cells, baculoviruses, bacteria, and yeast.

i. Mammalian Systems

Mammalian expression systems are known in the art.
A mammalian promoter is any DNA sequence capable of binding
mammalian RNA polymerase and initiating the downstream (3')
transcription of a coding sequence (e.g. structural gene)
into mRNA. A promoter will have a transcription initiating
region, which is usually placed proximal to the 5' end of
the coding sequence, and a TATA box, usually located 25-30
base pairs (bp) upstream of the transcription initiation
site. The TATA box is thought to direct RNA polymerase II
to begin RNA synthesis at the correct site. A mammalian
promoter will also contain an upstream promoter element,
usually located within 100 to 200 bp upstream of the TATA
box. An upstream promoter element determines the rate at
which transcription is initiated and can act in either
orientation, Sambrook et al., Molecular Cloning: A
Laboratory Manual, 2nd ed. (1989).
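
The positional relationship described above (a TATA box 25-30 bp
upstream of the transcription initiation site) can be illustrated
with a simple scan. The Python sketch below is an example only. It
assumes the distance is measured from the first base of the motif
to the initiation site, and it uses the minimal motif "TATA"; both
choices, and the function name, are assumptions of this sketch.

```python
def find_tata(seq, tss, motif="TATA", min_up=25, max_up=30):
    """Return the upstream distances (in bp) at which `motif` starts
    within the 25-30 bp window before the initiation site `tss`
    (a 0-based index into `seq`)."""
    seq = seq.upper()
    hits = []
    for up in range(min_up, max_up + 1):
        begin = tss - up
        if begin >= 0 and seq[begin:begin + len(motif)] == motif:
            hits.append(up)
    return hits
```

An empty result means no motif starts inside the window; it does
not by itself show that a sequence lacks promoter activity.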

Mammalian viral genes are often highly expressed
and have a broad host range; therefore sequences encoding
mammalian viral genes provide particularly useful promoter
sequences. Examples include the SV40 early promoter, mouse
mammary tumor virus LTR promoter, adenovirus major late
promoter (Ad MLP), and herpes simplex virus promoter. In
addition, sequences derived from non-viral genes, such as
the murine metallothionein gene, also provide useful
promoter sequences. Expression may be either constitutive
or regulated (inducible); depending on the promoter,
expression can be induced with glucocorticoid in
hormone-responsive cells.

The presence of an enhancer element (enhancer),
combined with the promoter elements described above, will
usually increase expression levels. An enhancer is a
regulatory DNA sequence that can stimulate transcription up
to 1000-fold when linked to homologous or heterologous
promoters, with synthesis beginning at the normal RNA start
site. Enhancers are also active when they are placed
upstream or downstream from the transcription initiation
site, in either normal or flipped orientation, or at a
distance of more than 1000 nucleotides from the promoter,
Maniatis et al., Science 236:1237 (1987); Alberts et al.,
Molecular Biology of the Cell, 2nd ed. (1989). Enhancer
elements derived from viruses may be particularly useful,
because they usually have a broader host range. Examples
include the SV40 early gene enhancer, Dijkema et al. (1985)
EMBO J. 4:761, and the enhancer/promoters derived from the
long terminal repeat (LTR) of the Rous Sarcoma Virus, Gorman
et al. (1982) Proc. Natl. Acad. Sci. 79:6777, and from human
cytomegalovirus, Boshart et al. (1985) Cell 41:521.
Additionally, some enhancers are regulatable and become
active only in the presence of an inducer, such as a hormone
or metal ion, Sassone-Corsi et al. (1986) Trends Genet.
2:215; Maniatis et al. (1987) Science 236:1237.

A DNA molecule may be expressed intracellularly in
mammalian cells. A promoter sequence may be directly linked
with the DNA molecule, in which case the first amino acid at
the N-terminus of the recombinant protein will always be a
methionine, which is encoded by the ATG start codon. If
desired, the N-terminus may be cleaved from the protein by
in vitro incubation with cyanogen bromide.
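
Cyanogen bromide cleaves a polypeptide on the carboxyl side of
methionine residues, which is why it can remove an N-terminal
methionine. The Python sketch below illustrates the expected
fragments for a one-letter amino acid sequence. It is an example
only: the function name is hypothetical, and the model ignores
reaction efficiency and the chemical conversion of the methionine
itself.

```python
def cnbr_fragments(protein):
    """Split a one-letter protein sequence after every methionine."""
    fragments, current = [], ""
    for residue in protein.upper():
        current += residue
        if residue == "M":
            # Cleavage occurs on the C-terminal side of Met.
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments
```

Applied to a sequence with an internal methionine, the sketch
returns more than two fragments, which shows that this approach
releases an intact product only when the protein of interest
contains no internal methionine.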

Alternatively, foreign proteins can also be
secreted from the cell into the growth media by creating
chimeric DNA molecules that encode a fusion protein
comprised of a leader sequence fragment that provides for
secretion of the foreign protein in mammalian cells.
Preferably, there are processing sites encoded between the
leader fragment and the foreign gene that can be cleaved
either in vivo or in vitro. The leader sequence fragment
usually encodes a signal peptide comprised of hydrophobic
amino acids which direct the secretion of the protein from
the cell. The adenovirus tripartite leader is an example of
a leader sequence that provides for secretion of a foreign
protein in mammalian cells.

Usually, transcription termination and
polyadenylation sequences recognized by mammalian cells are
regulatory regions located 3' to the translation stop codon
and thus, together with the promoter elements, flank the
coding sequence. The 3' terminus of the mature mRNA is
formed by site-specific post-transcriptional cleavage and
polyadenylation, Birnstiel et al. (1985) Cell 41:349;
Proudfoot and Whitelaw (1988) "Termination and 3' end
processing of eukaryotic RNA." In Transcription and Splicing
(ed. B.D. Hames and D.M. Glover); Proudfoot (1989) Trends
Biochem. Sci. 14:105. These sequences direct the
transcription of an mRNA which can be translated into the
polypeptide encoded by the DNA. Examples of transcription
terminator/polyadenylation signals include those derived
from SV40, Sambrook et al. (1989), Molecular Cloning: A
Laboratory Manual.

Some genes may be expressed more efficiently when
introns (also called intervening sequences) are present.
Several cDNAs, however, have been efficiently expressed from
vectors that lack splicing signals (also called splice donor
and acceptor sites), see e.g., Gething and Sambrook (1981)
Nature 293:620. Introns are intervening noncoding sequences
within a coding sequence that contain splice donor and
acceptor sites. They are removed by a process called
"splicing," following polyadenylation of the primary
transcript, Nevins (1983) Annu. Rev. Biochem. 52:441; Green
(1986) Annu. Rev. Genet. 20:671; Padgett et al. (1986) Annu.
Rev. Biochem. 55:1119; Krainer and Maniatis (1988) "RNA
splicing," In Transcription and Splicing (ed. B.D. Hames and
D.M. Glover).

Usually, the above-described components,
comprising a promoter, polyadenylation signal, and
transcription termination sequence are put together into
expression constructs. Enhancers, introns with functional
splice donor and acceptor sites, and leader sequences may
also be included in an expression construct, if desired.
Expression constructs are often maintained in a replicon,
such as an extrachromosomal element (e.g., plasmids) capable
of stable maintenance in a host, such as mammalian cells or
bacteria. Mammalian replication systems include those
derived from animal viruses, which require trans-acting
factors to replicate. For example, plasmids containing the
replication systems of papovaviruses, such as SV40, Gluzman
(1981) Cell 23:175, or polyomavirus, replicate to extremely
high copy number in the presence of the appropriate viral T
antigen. Additional examples of mammalian replicons include
those derived from bovine papillomavirus and Epstein-Barr
virus. Additionally, the replicon may have two replication
systems, thus allowing it to be maintained, for example, in
mammalian cells for expression and in a procaryotic host for
cloning and amplification. Examples of such mammalian-
bacteria shuttle vectors include pMT2, Kaufman et al. (1989)
Mol. Cell. Biol. 9:946, and pHEBO, Shimizu et al. (1986)
Mol. Cell. Biol. 6:1074.

The transformation procedure used depends upon the
host to be transformed. Methods for introduction of
heterologous polynucleotides into mammalian cells are known
in the art and include dextran-mediated transfection,
calcium phosphate precipitation, polybrene mediated
transfection, protoplast fusion, electroporation,
encapsulation of the polynucleotide(s) in liposomes, and
direct microinjection of the DNA into nuclei.

Mammalian cell lines available as hosts for
expression are known in the art and include many immortal-
ized cell lines available from the American Type Culture
Collection (ATCC), including but not limited to, Chinese
hamster ovary (CHO) cells, HeLa cells, baby hamster kidney
(BHK) cells, monkey kidney cells (COS), human hepatocellular
carcinoma cells (e.g., Hep G2), and a number of other cell
lines.

ii. Baculovirus Systems

The polynucleotide encoding the protein can also
be inserted into a suitable insect expression vector, and is
operably linked to the control elements within that vector.
Vector construction employs techniques which are known in
the art.

Generally, the components of the expression system
include a transfer vector, usually a bacterial plasmid,
which contains both a fragment of the baculovirus genome,
and a convenient restriction site for insertion of the
heterologous gene or genes to be expressed; a wild type
baculovirus with a sequence homologous to the baculovirus-
specific fragment in the transfer vector (this allows for
the homologous recombination of the heterologous gene into
the baculovirus genome); and appropriate insect host cells
and growth media.

After inserting the DNA sequence encoding the
protein into the transfer vector, the vector and the wild
type viral genome are transfected into an insect host cell
where the vector and viral genome are allowed to recombine.
The packaged recombinant virus is expressed and recombinant
plaques are identified and purified. Materials and methods
for baculovirus/insect cell expression systems are
commercially available in kit form from, inter alia,
Invitrogen, San Diego CA ("MaxBac" kit). These techniques
are generally known to those skilled in the art and fully
described in Summers and Smith, Texas Agricultural
Experiment Station Bulletin No. 1555 (1987) (hereinafter
"Summers and Smith").

Prior to inserting the DNA sequence encoding the
protein into the baculovirus genome, the above-described
components, comprising a promoter, leader (if desired),
coding sequence of interest, and transcription termination
sequence, are usually assembled into an intermediate
transplacement construct (transfer vector). This construct
may contain a single gene and operably linked regulatory
elements; multiple genes, each with its own set of
operably linked regulatory elements; or multiple genes,
regulated by the same set of regulatory elements.
Intermediate transplacement constructs are often maintained
in a replicon, such as an extrachromosomal element (e.g.,
plasmids) capable of stable maintenance in a host, such as
a bacterium. The replicon will have a replication system,
thus allowing it to be maintained in a suitable host for
cloning and amplification.

Currently, the most commonly used transfer vector
for introducing foreign genes into AcNPV is pAc373. Many
other vectors, known to those of skill in the art, have also
been designed. These include, for example, pVL985 (which
alters the polyhedrin start codon from ATG to ATT, and which
introduces a BamHI cloning site 32 basepairs downstream from
the ATT; see Luckow and Summers, Virology (1989) 17:31).

The plasmid usually also contains the polyhedrin
polyadenylation signal (Miller et al. (1988) Ann. Rev.
Microbiol., 42:177) and a procaryotic ampicillin-resistance
(amp) gene and origin of replication for selection and
propagation in E. coli.

Baculovirus transfer vectors usually contain a
baculovirus promoter. A baculovirus promoter is any DNA
sequence capable of binding a baculovirus RNA polymerase and
initiating the downstream (5' to 3') transcription of a
coding sequence (e.g. structural gene) into mRNA. A
promoter will have a transcription initiation region which
is usually placed proximal to the 5' end of the coding
sequence. This transcription initiation region usually
includes an RNA polymerase binding site and a transcription
initiation site. A baculovirus transfer vector may also
have a second domain called an enhancer, which, if present,
is usually distal to the structural gene. Expression may be
either regulated or constitutive.

Structural genes, abundantly transcribed at late
times in a viral infection cycle, provide particularly
useful promoter sequences. Examples include sequences
derived from the gene encoding the viral polyhedron protein,
Friesen et al., (1986) "The Regulation of Baculovirus Gene
Expression," in: The Molecular Biology of Baculoviruses (ed.
Walter Doerfler); EPO Publ. Nos. 127 839 and 155 476; and
the gene encoding the p10 protein, Vlak et al., (1988), J.
Gen. Virol. 69:765.

DNA encoding suitable signal sequences can be
derived from genes for secreted insect or baculovirus
proteins, such as the baculovirus polyhedrin gene (Carbonell
et al. (1988) Gene, 73:409). Alternatively, since the
signals for mammalian cell posttranslational modifications
(such as signal peptide cleavage, proteolytic cleavage, and
phosphorylation) appear to be recognized by insect cells,
and the signals required for secretion and nuclear
accumulation also appear to be conserved between the
invertebrate cells and vertebrate cells, leaders of non-
insect origin, such as those derived from genes encoding
human α-interferon, Maeda et al., (1985), Nature 315:592;
human gastrin-releasing peptide, Lebacq-Verheyden et al.,
(1988), Molec. Cell. Biol. 8:3129; human IL-2, Smith et al.,
(1985) Proc. Nat'l Acad. Sci. USA, 82:8404; mouse IL-3,
Miyajima et al., (1987) Gene 58:273; and human
glucocerebrosidase, Martin et al. (1988) DNA 7:99, can also
be used to provide for secretion in insects.

A recombinant polypeptide or polyprotein may be
expressed intracellularly or, if it is expressed with the
proper regulatory sequences, it can be secreted. Good
intracellular expression of nonfused foreign proteins
usually requires heterologous genes that ideally have a
short leader sequence containing suitable translation
initiation signals preceding an ATG start signal. If
desired, methionine at the N-terminus may be cleaved from
the mature protein by in vitro incubation with cyanogen
bromide.

Alternatively, recombinant polyproteins or
proteins which are not naturally secreted can be secreted
from the insect cell by creating chimeric DNA molecules that
encode a fusion protein comprised of a leader sequence
fragment that provides for secretion of the foreign protein
in insects. The leader sequence fragment usually encodes a
signal peptide comprised of hydrophobic amino acids which
direct the translocation of the protein into the endoplasmic
reticulum.

After insertion of the DNA sequence and/or the
gene encoding the expression product precursor of the
protein, an insect cell host is co-transformed with the
heterologous DNA of the transfer vector and the genomic DNA
of wild type baculovirus, usually by co-transfection. The
promoter and transcription termination sequence of the
construct will usually comprise a 2-5 kb section of the
baculovirus genome. Methods for introducing heterologous
DNA into the desired site in the baculovirus virus are known
in the art. (See Summers and Smith; Ju et al. (1987); Smith
et al., Mol. Cell. Biol. (1983) 3:2156; and Luckow and
Summers (1989)). For example, the insertion can be into a
gene such as the polyhedrin gene, by homologous double
crossover recombination; insertion can also be into a
restriction enzyme site engineered into the desired
baculovirus gene. Miller et al., (1989), Bioessays 4:91.

The DNA sequence, when cloned in place of the
polyhedrin gene in the expression vector, is flanked both 5'
and 3' by polyhedrin-specific sequences and is positioned
downstream of the polyhedrin promoter.

The newly formed baculovirus expression vector is
subsequently packaged into an infectious recombinant
baculovirus. Homologous recombination occurs at low
frequency (between about 1% and about 5%); thus, the
majority of the virus produced after cotransfection is still
wild-type virus. Therefore, a method is necessary to
identify recombinant viruses. An advantage of the
expression system is a visual screen allowing recombinant
viruses to be distinguished. The polyhedrin protein, which
is produced by the native virus, is produced at very high
levels in the nuclei of infected cells at late times after
viral infection. Accumulated polyhedrin protein forms
occlusion bodies that also contain embedded particles.
These occlusion bodies, up to 15 µm in size, are highly
refractile, giving them a bright shiny appearance that is
readily visualized under the light microscope. Cells
infected with recombinant viruses lack occlusion bodies. To
distinguish recombinant virus from wild-type virus, the
transfection supernatant is plaqued onto a monolayer of
insect cells by techniques known to those skilled in the
art. Namely, the plaques are screened under the light
microscope for the presence (indicative of wild-type virus)
or absence (indicative of recombinant virus) of occlusion
bodies. "Current Protocols in Microbiology" Vol. 2 (Ausubel
et al. eds) at 16.8 (Supp. 10, 1990); Summers and Smith;
Miller et al. (1989).
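
Because recombinants are a small fraction of the progeny virus,
the number of plaques that must be screened to find at least one
recombinant can be estimated. The Python sketch below is an
illustration under stated assumptions, not a procedure from the
original disclosure: it treats each plaque as an independent draw
with a fixed recombinant fraction f, so that the chance of at
least one recombinant among n plaques is 1 - (1 - f)^n. The
function name and the default confidence level are hypothetical.

```python
import math


def plaques_needed(recombinant_fraction, confidence=0.95):
    """Smallest n with 1 - (1 - f)**n >= confidence."""
    if not 0.0 < recombinant_fraction < 1.0:
        raise ValueError("recombinant_fraction must be between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    n = math.log(1.0 - confidence) / math.log(1.0 - recombinant_fraction)
    return math.ceil(n)
```

Under these assumptions, a recombination frequency near the low
end of the stated range calls for screening on the order of a few
hundred plaques, while a frequency near the high end calls for
several dozen.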

Recombinant baculovirus expression vectors have
been developed for infection into several insect cells. For
example, recombinant baculoviruses have been developed for,
inter alia: Aedes aegypti, Autographa californica, Bombyx
mori, Drosophila melanogaster, Spodoptera frugiperda, and
Trichoplusia ni (PCT Pub. No. WO 89/046699; Carbonell et
al., (1985) J. Virol. 56:153; Wright (1986) Nature 321:718;
Smith et al., (1983) Mol. Cell. Biol. 3:2156; and see
generally, Fraser, et al. (1989) In Vitro Cell. Dev. Biol.
25:225).

Cells and cell culture media are commercially
available for both direct and fusion expression of
heterologous polypeptides in a baculovirus/expression
system; cell culture technology is generally known to those
skilled in the art. See, e.g., Summers and Smith.

The modified insect cells may then be grown in an
appropriate nutrient medium, which allows for stable
maintenance of the plasmid(s) present in the modified insect
host. Where the expression product gene is under inducible
control, the host may be grown to high density, and
expression induced. Alternatively, where expression is
constitutive, the product will be continuously expressed
into the medium and the nutrient medium must be continuously
circulated, while removing the product of interest and
augmenting depleted nutrients. The product may be purified
by such techniques as chromatography, e.g., HPLC, affinity
chromatography, ion exchange chromatography, etc.;
electrophoresis; density gradient centrifugation; solvent
extraction, or the like. As appropriate, the product may be
further purified, as required, so as to remove substantially
any insect proteins which are also secreted in the medium or
result from lysis of insect cells, so as to provide a
product which is at least substantially free of host debris,
e.g., proteins, lipids and polysaccharides.

In order to obtain protein expression, recombinant
host cells derived from the transformants are incubated
under conditions which allow expression of the recombinant
protein encoding sequence. These conditions will vary,
dependent upon the host cell selected. However, the
conditions are readily ascertainable to those of ordinary
skill in the art, based upon what is known in the art.
iii. Bacterial Systems

Bacterial expression techniques are known in the
art. A bacterial promoter is any DNA sequence capable of
binding bacterial RNA polymerase and initiating the
downstream (3') transcription of a coding sequence (e.g.
structural gene) into mRNA. A promoter will have a
transcription initiation region which is usually placed
proximal to the 5' end of the coding sequence. This
transcription initiation region usually includes an RNA
polymerase binding site and a transcription initiation site.
A bacterial promoter may also have a second domain called an
operator, that may overlap an adjacent RNA polymerase
binding site at which RNA synthesis begins. The operator
permits negative regulated (inducible) transcription, as a
gene repressor protein may bind the operator and thereby
inhibit transcription of a specific gene. Constitutive
expression may occur in the absence of negative regulatory
elements, such as the operator. In addition, positive
regulation may be achieved by a gene activator protein
binding sequence, which, if present is usually proximal (5')
to the RNA polymerase binding sequence. An example of a
gene activator protein is the catabolite activator protein
(CAP), which helps initiate transcription of the lac operon
in E. coli, Raibaud et al. (1984) Annu. Rev. Genet. 18:173.
Regulated expression may therefore be either positive or
negative, thereby either enhancing or reducing
transcription.

Sequences encoding metabolic pathway enzymes
provide particularly useful promoter sequences. Examples
include promoter sequences derived from sugar metabolizing
enzymes, such as galactose, lactose (lac), Chang et al.
(1977) Nature 198:1056, and maltose. Additional examples
include promoter sequences derived from biosynthetic enzymes
such as tryptophan (trp), Goeddel et al. (1980) Nuc. Acids
Res. 8:4057; Yelverton et al. (1981) Nucl. Acids Res. 9:731;
U.S. 4,738,921; EPO Publ. Nos. 036 776 and 121 775. The
β-lactamase (bla) promoter system, Weissmann (1981) "The
cloning of interferon and other mistakes." In Interferon 3
(ed. I. Gresser), bacteriophage lambda PL, Shimatake et al.
(1981) Nature 292:128, and T5, U.S. 4,689,406, promoter
systems also provide useful promoter sequences.

In addition, synthetic promoters which do not
occur in nature also function as bacterial promoters. For
example, transcription activation sequences of one bacterial
or bacteriophage promoter may be joined with the operon
sequences of another bacterial or bacteriophage promoter,
creating a synthetic hybrid promoter, U.S. 4,551,433. For
example, the tac promoter is a hybrid trp-lac promoter
comprised of both trp promoter and lac operon sequences that
is regulated by the lac repressor, Amann et al. (1983) Gene
25:167; de Boer et al. (1983) Proc. Natl. Acad. Sci. 80:21.
Furthermore, a bacterial promoter can include naturally
occurring promoters of non-bacterial origin that have the
ability to bind bacterial RNA polymerase and initiate
transcription. A naturally occurring promoter of non-
bacterial origin can also be coupled with a compatible RNA
polymerase to produce high levels of expression of some
genes in prokaryotes. The bacteriophage T7 RNA
polymerase/promoter system is an example of a coupled
promoter system, Studier et al. (1986) J. Mol. Biol.
189:113; Tabor et al. (1985) Proc. Natl. Acad. Sci. 82:1074.
In addition, a hybrid promoter can also be comprised of a
bacteriophage promoter and an E. coli operator region (EPO
Publ. No. 267 851).

In addition to a functioning promoter sequence, an
efficient ribosome binding site is also useful for the
expression of foreign genes in prokaryotes. In E. coli, the
ribosome binding site is called the Shine-Dalgarno (SD)
sequence and includes an initiation codon (ATG) and a
sequence 3-9 nucleotides in length located 3-11 nucleotides
upstream of the initiation codon, Shine et al. (1975) Nature
254:34. The SD sequence is thought to promote binding of
mRNA to the ribosome by the pairing of bases between the SD
sequence and the 3' end of E. coli 16S rRNA, Steitz et al.
(1979) "Genetic signals and nucleotide sequences in
messenger RNA." In Biological Regulation and Development:
Gene Expression (ed. R. F. Goldberger). To express
eukaryotic genes and prokaryotic genes with weak ribosome-
binding site, Sambrook et al. (1989), Molecular Cloning-
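
The spacing rule for the Shine-Dalgarno sequence can be
illustrated with a short search routine. The Python sketch below
is an example only and makes several assumptions that are not in
the original text: the consensus "AGGAGG" is used as the reference
motif, any contiguous piece of it at least 3 nucleotides long
counts as a match, and the 3-11 nucleotide distance is measured as
the number of bases between the end of the match and the first
base of the initiation codon. The function name is hypothetical.

```python
def find_sd(seq, start, consensus="AGGAGG", min_len=3,
            min_gap=3, max_gap=11):
    """Return (match, gap) for the longest consensus piece found
    upstream of the ATG at index `start`, or None if none is found."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA letters
    if seq[start:start + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    # Try the longest pieces of the consensus first.
    for length in range(len(consensus), min_len - 1, -1):
        pieces = {consensus[i:i + length]
                  for i in range(len(consensus) - length + 1)}
        for gap in range(min_gap, max_gap + 1):
            end = start - gap      # index just past the candidate
            begin = end - length
            if begin >= 0 and seq[begin:end] in pieces:
                return seq[begin:end], gap
    return None
```

A result of None under this sketch would be consistent with a weak
or absent ribosome-binding site, although the scoring here is far
simpler than real translation initiation.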
A DNA molecule may be expressed intracellularly.
A promoter sequence may be directly linked with the DNA
molecule, in which case the first amino acid at the N-
terminus will always be a methionine, which is encoded by
the ATG start codon. If desired, methionine at the N-
terminus may be cleaved from the protein by in vitro
incubation with cyanogen bromide or by either in vivo or in
vitro incubation with a bacterial methionine N-terminal
peptidase (EPO Publ. No. 219 237).

Fusion proteins provide an alternative to direct
expression. Usually, a DNA sequence encoding the N-terminal
portion of an endogenous bacterial protein, or other stable
protein, is fused to the 5' end of heterologous coding
sequences. Upon expression, this construct will provide a
fusion of the two amino acid sequences. For example, the
bacteriophage lambda cII gene can be linked at the 5'
terminus of a foreign gene and expressed in bacteria. The
resulting fusion protein preferably retains a site for a
processing enzyme (factor Xa) to cleave the bacteriophage
protein from the foreign gene, Nagai et al. (1984) Nature
309:810. Fusion proteins can also be made with sequences
from the lacZ, Jia et al. (1987) Gene 60:197, trpE, Allen et
al. (1987) J. Biotechnol. 5:93; Makoff et al. (1989) J. Gen.
Microbiol. 135:11, and EPO Publ. No. 324 647, genes. The
DNA sequence at the junction of the two amino acid sequences
may or may not encode a cleavable site. Another example is
a ubiquitin fusion protein. Such a fusion protein is made
with the ubiquitin region that preferably retains a site for
a processing enzyme (e.g. ubiquitin specific processing-
protease) to cleave the ubiquitin from the foreign protein.
Through this method, native foreign protein can be isolated.
Miller et al. (1989) Bio/Technology 7:698.
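
The release of a foreign protein from a fusion by a processing
enzyme can be illustrated for factor Xa, which cleaves after the
arginine of the tetrapeptide Ile-Glu-Gly-Arg (IEGR in one-letter
code). The Python sketch below is an example only: the function
name and the sample sequence used to exercise it are hypothetical,
and the sketch cuts only at the first site it finds.

```python
def factor_xa_release(fusion, site="IEGR"):
    """Split a one-letter fusion sequence after the first factor Xa
    site. Returns (carrier_with_site, released_protein) or None."""
    fusion = fusion.upper()
    idx = fusion.find(site)
    if idx == -1:
        return None  # no cleavable junction in this fusion
    cut = idx + len(site)  # cleavage occurs after the Arg
    return fusion[:cut], fusion[cut:]
```

Because the cut falls immediately after the recognition sequence,
the released fragment begins with the first residue of the foreign
protein, which is the sense in which native protein can be
recovered from such a fusion.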

Alternatively, foreign proteins can also be secreted from the cell by creating chimeric DNA molecules that encode a fusion protein comprised of a signal peptide sequence fragment that provides for secretion of the foreign protein in bacteria, U.S. 4,336,336. The signal sequence fragment usually encodes a signal peptide comprised of hydrophobic amino acids which direct the secretion of the protein from the cell. The protein is either secreted into the growth media (gram-positive bacteria) or into the periplasmic space, located between the inner and outer membrane of the cell (gram-negative bacteria). Preferably there are processing sites, which can be cleaved either in vivo or in vitro, encoded between the signal peptide fragment and the foreign gene.
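
As a rough illustration of what "comprised of hydrophobic amino acids" means in practice, the sketch below averages Kyte-Doolittle hydropathy values over a peptide. The scale values are the published Kyte-Doolittle values; the function name and both example peptides are hypothetical and are not taken from this specification.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(peptide: str) -> float:
    """Average Kyte-Doolittle hydropathy of a peptide (one-letter codes)."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("empty peptide")
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)


# Hypothetical peptides: a leucine/alanine-rich core and a charged stretch.
hydrophobic_core = "LLALLALLAVA"
charged_stretch = "KDEKRDEKDER"

if __name__ == "__main__":
    print(round(mean_hydropathy(hydrophobic_core), 2))
    print(round(mean_hydropathy(charged_stretch), 2))
```

A positive average suggests a hydrophobic stretch of the kind found in signal peptides, while a strongly negative average is typical of charged, soluble segments. This is only a screening heuristic and does not predict cleavage sites.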

DNA encoding suitable signal sequences can be derived from genes for secreted bacterial proteins, such as the E. coli outer membrane protein gene (ompA), Masui et al. (1983), in: Experimental Manipulation of Gene Expression; Ghrayeb et al. (1984) EMBO J. 3:2437, and the E. coli alkaline phosphatase signal sequence (phoA), Oka et al. (1985) Proc. Natl. Acad. Sci. 82:7212. As an additional example, the signal sequence of the alpha-amylase gene from various Bacillus strains can be used to secrete heterologous proteins from B. subtilis. Palva et al. (1982) Proc. Natl. Acad. Sci. USA 79:5582; EPO Publ. No. 244 042.

Usually, transcription termination sequences recognized by bacteria are regulatory regions located 3' to the translation stop codon, and thus together with the promoter flank the coding sequence. These sequences direct the transcription of an mRNA which can be translated into the polypeptide encoded by the DNA. Transcription termination sequences frequently include DNA sequences of about 50 nucleotides capable of forming stem loop structures that aid in terminating transcription. Examples include transcription termination sequences derived from genes with strong promoters, such as the trp gene in E. coli as well as other biosynthetic genes.
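
A stem loop can form wherever a short sequence is followed, after a few unpaired bases, by its reverse complement. The sketch below scans a DNA string for such perfect inverted repeats. It is a minimal illustration only: the function names, the default stem and loop lengths, and the test sequence are assumptions, and real terminator prediction also weighs mismatches, GC content and free energy.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_stem_loops(seq: str, stem: int = 6, min_loop: int = 3, max_loop: int = 8):
    """Return (start, loop_length) for each perfect inverted repeat.

    `start` is the 1-based position of the first base of the left arm.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        left = seq[i:i + stem]
        target = reverse_complement(left)
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            right = seq[j:j + stem]
            if len(right) < stem:
                break  # longer loops would run past the end of the sequence
            if right == target:
                hits.append((i + 1, loop))
    return hits


if __name__ == "__main__":
    # Hypothetical sequence: GC-rich arms around a TTTT loop, then a T tract.
    print(find_stem_loops("AAGCCGCCTTTTGGCGGCTTTTT"))
```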

Usually, the above-described components, comprising a promoter, signal sequence (if desired), coding sequence of interest, and transcription termination sequence, are put together into expression constructs. Expression constructs are often maintained in a replicon, such as an extrachromosomal element (e.g., plasmids) capable of stable maintenance in a host, such as bacteria. The replicon will have a replication system, thus allowing it to be maintained in a procaryotic host either for expression or for cloning and amplification. In addition, a replicon may be either a high or low copy number plasmid. A high copy number plasmid will generally have a copy number ranging from about 5 to about 200, and usually about 10 to about 150. A host containing a high copy number plasmid will preferably contain at least about 10, and more preferably at least about 20 plasmids. Either a high or low copy number vector may be selected, depending upon the effect of the vector and the foreign protein on the host.

Alternatively, the expression constructs can be integrated into the bacterial genome with an integrating vector. Integrating vectors usually contain at least one sequence homologous to the bacterial chromosome that allows the vector to integrate. Integrations appear to result from recombinations between homologous DNA in the vector and the bacterial chromosome. For example, integrating vectors constructed with DNA from various Bacillus strains integrate into the Bacillus chromosome (EPO Publ. No. 127 328). Integrating vectors may also be comprised of bacteriophage or transposon sequences.

Usually, extrachromosomal and integrating expression constructs may contain selectable markers to allow for the selection of bacterial strains that have been transformed. Selectable markers can be expressed in the bacterial host and may include genes which render bacteria resistant to drugs such as ampicillin, chloramphenicol, erythromycin, kanamycin (neomycin), and tetracycline. Davies et al. (1978) Annu. Rev. Microbiol. 32:469. Selectable markers may also include biosynthetic genes, such as those in the histidine, tryptophan, and leucine biosynthetic pathways.

Alternatively, some of the above-described components can be put together in transformation vectors. Transformation vectors are usually comprised of a selectable marker that is either maintained in a replicon or developed into an integrating vector.

Expression and transformation vectors, either extra-chromosomal replicons or integrating vectors, have been developed for transformation into many bacteria. For example, expression vectors have been developed for, inter alia, the following bacteria: Bacillus subtilis, Palva et al. (1982) Proc. Natl. Acad. Sci. USA 79:5582; EPO Publ. Nos. 036 259 and 063 953; PCT Publ. No. WO 84/04541; E. coli, Shimatake et al. (1981) Nature 292:128; Amann et al. (1985) Gene 40:183; Studier et al. (1986) J. Mol. Biol. 189:113; EPO Publ. Nos. 036 776, 136 829 and 136 907; Streptococcus cremoris, Powell et al. (1988) Appl. Environ. Microbiol. 54:655; Streptococcus lividans, Powell et al. (1988) Appl. Environ. Microbiol. 54:655; and Streptomyces lividans, U.S. 4,745,056.

Methods of introducing exogenous DNA into bacterial hosts are well-known in the art, and usually include either the transformation of bacteria treated with CaCl2 or other agents, such as divalent cations and DMSO. DNA can also be introduced into bacterial cells by electroporation. Transformation procedures usually vary with the bacterial species to be transformed. See, e.g., Masson et al. (1989) FEMS Microbiol. Lett. 60:273; Palva et al. (1982) Proc. Natl. Acad. Sci. USA 79:5582; EPO Publ. Nos. 036 259 and 063 953; PCT Publ. No. WO 84/04541, for Bacillus; Miller et al. (1988) Proc. Natl. Acad. Sci. 85:856; Wang et al. (1990) J. Bacteriol. 172:949, for Campylobacter; Cohen et al. (1973) Proc. Natl. Acad. Sci. 69:2110; Dower et al. (1988) Nucleic Acids Res. 16:6127; Kushner (1978) "An improved method for transformation of E. coli with ColE1-derived plasmids," in: Genetic Engineering: Proceedings of the International Symposium on Genetic Engineering (eds. H.W. Boyer and S. Nicosia); Mandel et al. (1970) J. Mol. Biol. 53:159; Taketo (1988) Biochim. Biophys. Acta 949:318, for Escherichia; Chassy et al. (1987) FEMS Microbiol. Lett. 44:173, for Lactobacillus; Fiedler et al. (1988) Anal. Biochem. 170:38, for Pseudomonas; Augustin et al. (1990) FEMS Microbiol. Lett. 66:203, for Staphylococcus; Barany et al. (1980) J. Bacteriol. 144:698; Harlander (1987) "Transformation of Streptococcus lactis by electroporation," in: Streptococcal Genetics (ed. J. Ferretti and R. Curtiss III); Perry et al. (1981) Infec. Immun. 32:1295; Powell et al. (1988) Appl. Environ. Microbiol. 54:655; Somkuti et al. (1987) Proc. 4th Eur. Cong. Biotechnology 1:412, for Streptococcus.

Yeast Expression 

Yeast expression systems are also known to one of ordinary skill in the art. A yeast promoter is any DNA sequence capable of binding yeast RNA polymerase and initiating the downstream (3') transcription of a coding sequence (e.g. structural gene) into mRNA. A promoter will have a transcription initiation region which is usually placed proximal to the 5' end of the coding sequence. This transcription initiation region usually includes an RNA polymerase binding site (the "TATA Box") and a transcription initiation site. A yeast promoter may also have a second domain called an upstream activator sequence (UAS), which, if present, is usually distal to the structural gene. The UAS permits regulated (inducible) expression. Constitutive expression occurs in the absence of a UAS. Regulated expression may be either positive or negative, thereby either enhancing or reducing transcription.

Yeast is a fermenting organism with an active metabolic pathway, therefore sequences encoding enzymes in the metabolic pathway provide particularly useful promoter sequences. Examples include alcohol dehydrogenase (ADH) (EPO Publ. No. 284 044), enolase, glucokinase, glucose-6-phosphate isomerase, glyceraldehyde-3-phosphate-dehydrogenase (GAP or GAPDH), hexokinase, phosphofructokinase, 3-phosphoglycerate mutase, and pyruvate kinase (PyK) (EPO Publ. No. 329 203). The yeast PHO5 gene, encoding acid phosphatase, also provides useful promoter sequences, Myanohara et al. (1983) Proc. Natl. Acad. Sci. USA 80:1.

In addition, synthetic promoters which do not occur in nature also function as yeast promoters. For example, UAS sequences of one yeast promoter may be joined with the transcription activation region of another yeast promoter, creating a synthetic hybrid promoter. Examples of such hybrid promoters include the ADH regulatory sequence linked to the GAP transcription activation region (U.S. 4,876,197 and U.S. 4,880,734). Other examples of hybrid promoters include promoters which consist of the regulatory sequences of either the ADH2, GAL4, GAL10, or PHO5 genes, combined with the transcriptional activation region of a glycolytic enzyme gene such as GAP or PyK (EPO Publ. No. 164 556). Furthermore, a yeast promoter can include naturally occurring promoters of non-yeast origin that have the ability to bind yeast RNA polymerase and initiate transcription. Examples of such promoters include, inter alia, Cohen et al. (1980) Proc. Natl. Acad. Sci. USA 77:1078; Henikoff et al. (1981) Nature 283:835; Hollenberg et al. (1981) Curr. Topics Microbiol. Immunol. 96:119; Hollenberg et al. (1979) "The Expression of Bacterial Antibiotic Resistance Genes in the Yeast Saccharomyces cerevisiae," in: Plasmids of Medical, Environmental and Commercial Importance (eds. K.N. Timmis and A. Puhler); Mercerau-Puigalon et al. (1980) Gene 11:163; Panthier et al. (1980) Curr. Genet. 2:109.

A DNA molecule may be expressed intracellularly in yeast. A promoter sequence may be directly linked with the DNA molecule, in which case the first amino acid at the N-terminus of the recombinant protein will always be a methionine, which is encoded by the ATG start codon. If desired, methionine at the N-terminus may be cleaved from the protein by in vitro incubation with cyanogen bromide.
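
Cyanogen bromide cleaves a polypeptide chain on the carboxyl side of methionine residues, which is why it can release an N-terminal methionine. A minimal sketch of that rule follows. The function name and the example peptides are hypothetical, and the code ignores incomplete cleavage and side reactions.

```python
def cnbr_fragments(peptide: str) -> list[str]:
    """Split a peptide after every methionine (M), as cyanogen bromide would."""
    fragments = []
    current = ""
    for residue in peptide.upper():
        current += residue
        if residue == "M":
            # Cleavage occurs on the C-terminal side of Met.
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


if __name__ == "__main__":
    # Hypothetical peptide with an N-terminal Met and no internal Met.
    print(cnbr_fragments("MASTKDE"))
    # Hypothetical peptide with an internal Met: the product is also cut.
    print(cnbr_fragments("MAKMGG"))
```

The second call shows the practical limit of the approach: any internal methionine is also a cleavage site, so the treatment yields an intact product only when the protein of interest contains no internal methionine.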

Fusion proteins provide an alternative for yeast expression systems, as well as in mammalian, baculovirus, and bacterial expression systems. Usually, a DNA sequence encoding the N-terminal portion of an endogenous yeast protein, or other stable protein, is fused to the 5' end of heterologous coding sequences. Upon expression, this construct will provide a fusion of the two amino acid sequences. For example, the yeast or human superoxide dismutase (SOD) gene can be linked at the 5' terminus of a foreign gene and expressed in yeast. The DNA sequence at the junction of the two amino acid sequences may or may not encode a cleavable site. See e.g., EPO Publ. No. 196 056. Another example is a ubiquitin fusion protein. Such a fusion protein is made with the ubiquitin region that preferably retains a site for a processing enzyme (e.g. ubiquitin-specific processing protease) to cleave the ubiquitin from the foreign protein. Through this method, therefore, native foreign protein can be isolated (see, e.g., PCT Publ. No. WO 88/024066).

Alternatively, foreign proteins can also be secreted from the cell into the growth media by creating chimeric DNA molecules that encode a fusion protein comprised of a leader sequence fragment that provides for secretion in yeast of the foreign protein. Preferably, there are processing sites encoded between the leader fragment and the foreign gene that can be cleaved either in vivo or in vitro. The leader sequence fragment usually encodes a signal peptide comprised of hydrophobic amino acids which direct the secretion of the protein from the cell.

DNA encoding suitable signal sequences can be derived from genes for secreted yeast proteins, such as the yeast invertase gene (EPO Publ. No. 012 873; JPO Publ. No. 62,096,086) and the A-factor gene (U.S. 4,588,684). Alternatively, leaders of non-yeast origin, such as an interferon leader, exist that also provide for secretion in yeast (EPO Publ. No. 060 057).

A preferred class of secretion leaders are those that employ a fragment of the yeast alpha-factor gene, which contains both a "pre" signal sequence and a "pro" region. The types of alpha-factor fragments that can be employed include the full-length pre-pro alpha factor leader (about 83 amino acid residues) as well as truncated alpha-factor leaders (usually about 25 to about 50 amino acid residues) (U.S. 4,546,083 and U.S. 4,870,008; EPO Publ. No. 324 274). Additional leaders employing an alpha-factor leader fragment that provides for secretion include hybrid alpha-factor leaders made with a presequence of a first yeast, but a pro-region from a second yeast alpha factor. (See e.g., PCT Publ. No. WO 89/02463.)

Usually, transcription termination sequences recognized by yeast are regulatory regions located 3' to the translation stop codon, and thus together with the promoter

flank the coding sequence. These sequences direct the transcription of an mRNA which can be translated into the polypeptide encoded by the DNA. Examples include transcription terminator sequences and other yeast-recognized termination sequences, such as those coding for glycolytic enzymes.

Usually, the above-described components, comprising a promoter, leader (if desired), coding sequence of interest, and transcription termination sequence, are put together into expression constructs. Expression constructs are often maintained in a replicon, such as an extrachromosomal element (e.g., plasmids) capable of stable maintenance in a host, such as yeast or bacteria. The replicon may have two replication systems, thus allowing it to be maintained, for example, in yeast for expression and in a procaryotic host for cloning and amplification. Examples of such yeast-bacteria shuttle vectors include YRp17, Stinchcomb et al. (1982) J. Mol. Biol. 158:157. In addition, a replicon may be either a high or low copy number plasmid. A high copy number plasmid will generally have a copy number ranging from about 5 to about 200, and usually about 10 to about 150. A host containing a high copy number plasmid will preferably have at least about 10, and more preferably at least about 20. Either a high or low copy number vector may be selected, depending upon the effect of the vector and the foreign protein on the host.

Alternatively, the expression constructs can be integrated into the yeast genome with an integrating vector. Integrating vectors usually contain at least one sequence homologous to a yeast chromosome that allows the vector to integrate, and preferably contain two homologous sequences flanking the expression construct. Integrations appear to result from recombinations between homologous DNA in the vector and the yeast chromosome, Orr-Weaver et al. An integrating vector may be directed to a specific locus in yeast by selecting the appropriate homologous sequence for inclusion in the vector.
One or more expression construct may integrate, possibly

Bio/Technology 8:135; Pichia guillerimondii, Kunze et al. (1985) J. Basic Microbiol. 25:141; Pichia pastoris, Cregg et al. (1985) Mol. Cell. Biol. 5:3376; U.S. 4,837,148 and 4,929,555; Saccharomyces cerevisiae, Hinnen et al. (1978) Proc. Natl. Acad. Sci. USA 75:1929; Ito et al. (1983) J. Bacteriol. 153:163; Schizosaccharomyces pombe, Beach et al. (1981) Nature 300:706; and Yarrowia lipolytica, Davidow et al. (1985) Curr. Genet. 10:39; Gaillardin et al. (1985) Curr. Genet. 10:49.

Methods of introducing exogenous DNA into yeast hosts are well-known in the art, and usually include either the transformation of spheroplasts or of intact yeast cells treated with alkali cations. Transformation procedures usually vary with the yeast species to be transformed. See e.g., Kurtz et al. (1986) Mol. Cell. Biol. 6:142; Kunze et al. (1985) J. Basic Microbiol. 25:141, for Candida; Gleeson et al. (1986) J. Gen. Microbiol. 132:3459; Roggenkamp et al. (1986) Mol. Gen. Genet. 202:302, for Hansenula; Das et al. (1984) J. Bacteriol. 158:1165; De Louvencourt et al. (1983) J. Bacteriol. 154:1165; Van den Berg et al. (1990) Bio/Technology 8:135, for Kluyveromyces; Cregg et al. (1985) Mol. Cell. Biol. 5:3376; Kunze et al. (1985) J. Basic Microbiol. 25:141; U.S. 4,837,148 and U.S. 4,929,555, for Pichia; Hinnen et al. (1978) Proc. Natl. Acad. Sci. USA 75:1929; Ito et al. (1983) J. Bacteriol. 153:163, for Saccharomyces; Beach et al. (1981) Nature 300:706, for Schizosaccharomyces; Davidow et al. (1985) Curr. Genet. 10:39; Gaillardin et al. (1985) Curr. Genet. 10:49, for Yarrowia.

E. Vaccines

Each of the H. pylori proteins discussed herein may be used as a sole vaccine candidate or in combination with one or more other antigens, the latter either from H. pylori or other pathogenic sources. Preferred are "cocktail" vaccines comprising, for example, the cytotoxin (CT) antigen, the CAI protein, and the urease. Additionally, the hsp can be added to one or more of these components. These vaccines may either be prophylactic (to

Stimulon™ (Cambridge Bioscience, Worcester, MA) may be used or particles generated therefrom such as ISCOMs (immunostimulating complexes); (4) Complete Freunds Adjuvant (CFA) and Incomplete Freunds Adjuvant (IFA); (5) cytokines, such as interleukins (IL-1, IL-2, etc.), macrophage colony stimulating factor (M-CSF), tumor necrosis factor (TNF), etc.; and (6) other substances that act as immunostimulating agents to enhance the effectiveness of the composition. Alum and MF59 are preferred.

As mentioned above, muramyl peptides include, but are not limited to, N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP), N-acetyl-normuramyl-L-alanyl-D-isoglutamine (nor-MDP), N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine (MTP-PE), etc.

The immunogenic compositions (e.g., the antigen, pharmaceutically acceptable carrier, and adjuvant) typically will contain diluents, such as water, saline, glycerol, ethanol, etc. Additionally, auxiliary substances, such as wetting or emulsifying agents, pH buffering substances, and the like, may be present in such vehicles.

Typically, the immunogenic compositions are prepared as injectables, either as liquid solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid vehicles prior to injection may also be prepared. The preparation also may be emulsified or encapsulated in liposomes for enhanced adjuvant effect, as discussed above under pharmaceutically acceptable carriers.

Immunogenic compositions used as vaccines comprise an immunologically effective amount of the antigenic polypeptides, as well as any other of the above-mentioned components, as needed. By "immunologically effective amount", it is meant that the administration of that amount to an individual, either in a single dose or as part of a series, is effective for treatment or prevention. This amount varies depending upon the health and physical condition of the individual to be treated, the taxonomic group of individual to be treated (e.g., nonhuman primate, primate, etc.), the capacity of the individual's immune system to synthesize antibodies, the degree of protection desired, the formulation of the vaccine, the treating doctor's assessment of the medical situation, and other relevant factors. It is expected that the amount will fall in a relatively broad range that can be determined through routine trials.

The immunogenic compositions are conventionally administered parenterally, e.g., by injection, either subcutaneously or intramuscularly. Additional formulations suitable for other modes of administration include oral and pulmonary formulations, suppositories, and transdermal applications. Oral formulations are most preferred for the H. pylori proteins. Dosage treatment may be a single dose schedule or a multiple dose schedule. The vaccine may be administered in conjunction with other immunoregulatory agents.

F. Immunodiagnostic Assays

H. pylori antigens can be used in immunoassays to detect antibody levels (or conversely H. pylori antibodies can be used to detect antigen levels) and correlation can be made with gastroduodenal disease and with duodenal ulcer in particular. Immunoassays based on well defined, recombinant antigens can be developed to replace the invasive diagnostic methods that are used today. Antibodies to H. pylori proteins within biological samples, including for example, blood or serum samples, can be detected. Design of the immunoassays is subject to a great deal of variation, and a variety of these are known in the art. Protocols for the immunoassay may be based, for example, upon competition, or direct reaction, or sandwich type assays. Protocols may also, for example, use solid supports, or may be by immunoprecipitation. Most assays involve the use of labeled antibody or polypeptide; the labels may be, for example, fluorescent, chemiluminescent, radioactive, or dye molecules. Assays which amplify the signals from the probe are also known; examples of which are assays which utilize biotin and avidin, and enzyme-labeled and mediated immunoassays, such as ELISA assays.

Kits suitable for immunodiagnosis and containing the appropriate labeled reagents are constructed by packaging the appropriate materials, including the compositions of the invention, in suitable containers, along with the remaining reagents and materials (for example, suitable buffers, salt solutions, etc.) required for the conduct of the assay, as well as a suitable set of assay instructions.

The examples presented below are provided as a further guide to the practitioner of ordinary skill in the art and are not to be construed as limiting the invention in any way.

i. H. pylori cytotoxin (CT) antigen

1. Materials and methods

For general materials and methods relating to H. pylori growth and DNA isolation, see sections ii and iii below, relating to CAI antigen and hsp, respectively.
a. Cloning

Two mixtures of degenerate oligonucleotides were synthesized using an Applied Biosystems model 380B DNA synthesizer. These mixtures were used at a concentration of 4 micromolar in a 100 microliter polymerase chain reaction with 200 nanograms of purified DNA using the GeneAmp PCR kit according to the manufacturer's instructions. The reaction was incubated for 1 minute at 94 degrees centigrade, 2 minutes at 48 degrees centigrade and 2 minutes at 56 degrees centigrade. The reaction mix was subjected to 30 cycles of these conditions.
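
The quantities in this protocol can be checked with simple arithmetic. The sketch below converts the stated primer concentration and reaction volume into an amount of primer, and sums the programmed hold times over all cycles. The function names are hypothetical, and the time estimate ignores temperature ramping, which the text does not specify.

```python
def primer_picomoles(conc_micromolar: float, volume_microliters: float) -> float:
    """Amount of primer in pmol: (umol/L) x (uL) = pmol."""
    return conc_micromolar * volume_microliters


def total_hold_minutes(step_minutes: list[float], cycles: int) -> float:
    """Total programmed hold time, ignoring ramping between temperatures."""
    return sum(step_minutes) * cycles


if __name__ == "__main__":
    # 4 micromolar oligonucleotide mixture in a 100 microliter reaction.
    print(primer_picomoles(4, 100), "pmol of each primer mixture")
    # 1 min at 94 C, 2 min at 48 C, 2 min at 56 C, repeated for 30 cycles.
    print(total_hold_minutes([1, 2, 2], 30), "minutes of hold time")
```

Under these assumptions each mixture contributes 400 pmol of oligonucleotide, and the 30 cycles account for 150 minutes of hold time.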

Analysis of the products of this reaction by agarose gel electrophoresis revealed a prominent approximately 87 bp DNA fragment. After digestion with the restriction enzymes XbaI and EcoRI, the fragment was ligated to the Bluescript SK+ (Stratagene) plasmid which had previously also been digested with XbaI and EcoRI. The ligation mixture was used to transform competent E. coli by electroporation at 2000 V and 25 microfarads (200 ohms) using a BioRad Gene Pulser (California). Transformed E. coli were selected for growth on L-agar plates containing 100 micrograms per milliliter ampicillin. Plasmid DNA was extracted from positive E. coli isolates and subjected to sequence analysis using the Sequenase 2 (United States Biochemical Corporation) DNA sequencing kit according to the manufacturer's instructions.
b. Preparation of libraries
(1) Library of HindIII fragments

Seven micrograms of purified DNA were digested to completion with the restriction enzyme HindIII. Three micrograms of Bluescript SK+ plasmid DNA were digested to completion with HindIII then treated with calf intestinal phosphatase. Both DNA mixtures were purified by agitation with a water saturated phenol then precipitated by addition of ethyl alcohol to 67% V/V. Both DNAs were resuspended in 50 microliters of water. 0.7 micrograms of DNA fragments were mixed with 0.3 micrograms of Bluescript DNA in 50 microliters of a solution containing 25 mM Tris pH 7.5, 10 mM MgCl2 and 5 units of T4 DNA ligase. This mix was incubated at 15 degrees centigrade for 20 hours after which the DNA was extracted with water saturated phenol and precipitated from ethyl alcohol. The DNA was subsequently resuspended in 50 microliters of water. Introduction of 1 microliter of this DNA into E. coli by electroporation resulted in approximately 3000-10,000 ampicillin resistant bacterial colonies.
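
Two figures in this procedure lend themselves to a quick calculation: the volume of ethanol needed to reach 67% V/V, and the colony yield per microgram of ligated DNA. The sketch below is illustrative only. The function names are hypothetical, the ethanol calculation assumes pure ethanol and additive volumes, and the yield calculation assumes that no DNA is lost during phenol extraction and precipitation, none of which is stated in the text.

```python
def ethanol_volume(sample_volume: float, final_fraction: float) -> float:
    """Volume of pure ethanol to add so that ethanol is `final_fraction` (v/v).

    Assumes volumes are additive, which is only approximately true.
    """
    if not 0 < final_fraction < 1:
        raise ValueError("final_fraction must be between 0 and 1")
    return sample_volume * final_fraction / (1 - final_fraction)


def colonies_per_microgram(colonies: float, total_dna_ug: float,
                           resuspension_ul: float, volume_used_ul: float) -> float:
    """Colonies per microgram of DNA, assuming full recovery during clean-up."""
    dna_used_ug = total_dna_ug * volume_used_ul / resuspension_ul
    return colonies / dna_used_ug


if __name__ == "__main__":
    # Roughly two volumes of ethanol give 67% V/V.
    print(round(ethanol_volume(50, 0.67), 1), "microliters for a 50 microliter sample")
    # 0.7 + 0.3 = 1.0 microgram ligated, resuspended in 50 microliters, 1 microliter used.
    for colonies in (3000, 10000):
        print(round(colonies_per_microgram(colonies, 1.0, 50, 1)), "colonies per microgram")
```

On these assumptions, the 3000-10,000 colonies obtained from 1 microliter correspond to roughly 150,000 to 500,000 colonies per microgram of total DNA in the ligation.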
(2) Library of EcoRI fragments

About 0.7 micrograms of EcoRI digested DNA was purified and mixed with 0.45 micrograms of Bluescript SK+ plasmid which had been previously digested with EcoRI and treated with calf intestinal phosphatase. The fragments were ligated in 50 microliters of solution. After purification and precipitation, the DNA was resuspended in 50 microliters of water. Electroporation of E. coli with 1 microliter of this solution resulted in approximately 200 ampicillin resistant bacterial colonies.

In order to identify suitable restriction fragments from the genome for further cloning, the plasmid was uniformly labeled with 32P and used as a probe to analyze DNA from the strain CCUG digested with various restriction enzymes, separated on agarose gel electrophoresis and transferred to nitrocellulose filter. The probe revealed a unique approximately 3.5 kb HindIII restriction fragment. A library of HindIII digested DNA fragments was prepared and cloned in the Bluescript plasmid vector. This library was screened with 32P labeled DNA corresponding to the 87 bp fragment previously cloned. Two clones containing identical approximately 3.3 kbp HindIII fragments were identified. DNA sequencing of these HindIII fragments revealed sequences capable of coding for the 23 amino acids corresponding to the amino terminus of the previously described 87 kDa cytotoxin. These sequences comprised part of an open reading frame of approximately 300 nucleotides which terminated at the extremity of the fragment delimited by a HindIII restriction site. The sequence also revealed the existence of an EcoRI restriction site within the putative open reading frame 120 bp away from the HindIII site.
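
Locating restriction sites in a determined sequence, as described above, amounts to searching for the enzymes' recognition sequences. The sketch below does this for the three enzymes named in this example. The recognition sequences are the standard ones; the function name and the toy sequence are hypothetical and do not reproduce any part of the sequence in Fig. 1.

```python
# Standard recognition sequences (all palindromic, so one strand suffices).
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "XbaI": "TCTAGA",
}


def find_sites(sequence: str, enzyme: str) -> list[int]:
    """Return the 1-based start position of every recognition site."""
    site = RECOGNITION_SITES[enzyme]
    sequence = sequence.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start + 1)
        start = sequence.find(site, start + 1)
    return positions


if __name__ == "__main__":
    # Toy sequence: an EcoRI site, a 10-base spacer, then a HindIII site.
    toy = "ATGC" + "GAATTC" + "C" * 10 + "AAGCTT" + "GG"
    ecori = find_sites(toy, "EcoRI")
    hindiii = find_sites(toy, "HindIII")
    print(ecori, hindiii, hindiii[0] - ecori[0])
```

The difference between the two start positions gives the spacing between sites, which is the kind of measurement used to choose the EcoRI-HindIII interval as a probe.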

A 32P labeled probe corresponding to the sequences between the EcoRI site and the HindIII site was used to screen a library of EcoRI fragments from DNA cloned in the Bluescript SK vector. This probe revealed two clones containing approximately 7.3 kbp fragments. DNA sequencing of these fragments revealed a continuous open reading frame which overlapped with the sequences determined from the 3.2 kbp HindIII fragments. The DNA sequence of these overlapping fragments and the conceptual translation of the single long open reading frame contained are shown in Figs. 1 and 2, respectively.

It should be noted that these clones were found to be extremely unstable. The initial colonies identified in the screening were so small as to be difficult to detect. Expansion of these clones by traditional methods of subculturing for 16-18 hours resulted in very heterogeneous populations of plasmids due to DNA rearrangement and deletion. Sufficient quantities of these clones were grown by subculturing for 8-10 hours in the absence of antibiotic selection. In this fashion, although yields of plasmid were relatively low, selection and outgrowth of bacteria containing viable rearranged plasmid were avoided.
c. Screening of DNA libraries

The product of the PCR reaction which contained the predominant 87 bp fragment was labeled with 32P by the random priming method using the Prime-a-gene kit (Promega). This labeled probe was used in a hybridization reaction with DNA from approximately 3000 bacterial clones immobilized on nitrocellulose filters. The hybridization reaction was carried out at 60 degrees centigrade in a solution of 0.3 M NaCl. A positive bacterial clone was expanded and plasmid DNA was prepared. The plasmid contained an insert of approximately 3.3 kb of DNA and was designated TOXHH1.

A 120 bp fragment containing the sequences between position 292 and 410 shown in Fig. 1 was derived from the plasmid TOXHH1 and used to screen approximately 400 colonies of the library of EcoRI fragments. A positive clone was isolated which contained approximately 7.3 kb of DNA sequences and was designated TOXEE1.

The nucleotide sequence shown in Fig. 1 was derived from the clones TOXHH1 and TOXEE1 using the Sequenase 2 sequencing kit. The nucleotides between position 1 and 410 in Fig. 1 were derived from TOXHH1 and those between 291 and 3507 were derived from TOXEE1. E. coli containing plasmids TOXHH1 and TOXEE1 have been deposited with the American Type Culture Collection, see below.
d. Preparation of antisera against the cytotoxin

A DNA fragment corresponding to nucleotides 116-413 of the sequence shown in Fig. 1 was cloned into the bacterial expression vector pEX 34 A, such that on induction of the bacterial promoter, a fusion protein was produced which contained a part of the MS2 polymerase polypeptide fused to the amino acids of the cytotoxin polypeptide and including the 23 amino acids previously identified. Approximately 200 micrograms of this fusion protein were partially purified by acrylamide gel electrophoresis and used to immunize rabbits by standard procedures.

Antisera from these rabbits taken after 3 immunizations spaced 1 month apart were used to probe protein extracts from a cytotoxin positive and a cytotoxin negative strain of H. pylori in standard immunoblotting experiments. The antisera revealed a polypeptide which migrated on denaturing polyacrylamide gel electrophoresis with an apparent molecular mass of 100 kDa. This polypeptide was detected in protein extracts of the cytotoxin positive but not the cytotoxin negative strain. Serum collected prior to immunization did not react with this polypeptide.
e. Partial purification of vacuolating activity

Total H. pylori membranes at a concentration of 6 mg/ml were solubilized in a solution of 1% CHAPS, 0.5 M NaCl, 10 mM Hepes pH 7.4, 2.5 mM EDTA, 20% sucrose for 1 hour at 4°C. This mixture was then applied to a discontinuous sucrose gradient containing steps of 30%, 35%, 40% and 55% sucrose and subjected to ultracentrifugation for 17 hours at 20000 x g. The gradient was fractionated and each fraction was tested for vacuolating activity and for urease activity. Vacuolating activity associated with urease activity was found in several fractions of the gradient. A peak of vacuolating activity was also found in the topmost fractions of the gradient and these fractions were essentially free of urease activity.

This urease-independent vacuolating activity was further fractionated by stepwise precipitation with ammonium sulphate at concentrations up to 34%. Denaturing polyacrylamide gel electrophoresis of the proteins precipitated at different concentrations of ammonium sulphate revealed a predominant polypeptide of about 100 kDa which copurified with the vacuolating activity. This polypeptide was recognized by the rabbit antisera raised against the recombinant fusion protein described above.
30 ^. KesttJ,^ 

Two overlapping fragments corresponding to about
10 kbp of the H. pylori genome have been cloned. These
clones contain a gene, consisting of 3960 bp (shown in Fig. 1),
which is capable of coding for a polypeptide of 1296 amino
acids (shown in Fig. 2). The molecular weight of this
putative polypeptide is 139.8 kDa. The nucleotide sequence
AGCAAG 9 bp upstream of the methionine codon at position 18
in Fig. 1 closely resembles the consensus Shine-Dalgarno
sequence and supports the hypothesis that this methionine
represents the initiator methionine for synthesis of the
polypeptide. A 30 bp nucleotide sequence which begins 10 bp
downstream of the putative stop codon at position 3906 in
Fig. 1 closely resembles the structure of prokaryotic
transcription terminators and is likely to represent the end
of the messenger RNA coding sequences.
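The coordinates quoted above can be cross-checked with simple arithmetic. The following Python sketch assumes 1-based numbering in which position 18 is the first base of the initiator codon and position 3906 is the first base of the stop codon; that convention is an assumption about Fig. 1, and the function name is illustrative only.

```python
def stop_codon_start(start_pos: int, n_amino_acids: int) -> int:
    """Return the 1-based position of the first base of the stop codon.

    Assumes the open reading frame begins at start_pos (first base of
    the initiator codon) and encodes n_amino_acids residues without gaps.
    """
    return start_pos + 3 * n_amino_acids


# Values quoted in the text for the cytotoxin gene (Fig. 1 numbering assumed).
START_POS = 18
N_RESIDUES = 1296

stop_pos = stop_codon_start(START_POS, N_RESIDUES)
print(stop_pos)  # 3906
```

Under this convention the initiator position, the length of the polypeptide and the stop codon position are mutually consistent.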

The cytotoxin gene is defined as coding for a
polypeptide precursor of the H. pylori vacuolating activity
by the following criteria:

(i) The putative polypeptide contains the 23
amino acid sequence (Fig. 2, positions 34-56) identified as
the amino terminus of the previously described 87 kDa
vacuolating protein, Cover et al., J. Biol. Chem. 267:10570-75
(1992). This sequence is preceded by 33 amino acids which
resemble prokaryotic leader sequences; thus, this sequence
is likely to represent the amino terminus of a mature
protein;

(ii) Rabbit antisera specific for a 100 amino
acid fragment of the putative polypeptide containing the
proposed amino terminus recognized a 100 kDa polypeptide in
a cytotoxin positive but not a cytotoxin negative strain of
H. pylori. This 100 kDa polypeptide copurifies with
vacuolating activity from H. pylori membranes.

In sum, the gene described herein codes for an
approximately 140 kDa polypeptide which is processed to a
100 kDa polypeptide involved in H. pylori cytotoxic
activity. The 87 kDa polypeptide previously described must
result from either further processing of the 100 kDa
polypeptide or from proteolytic degradation during
purification.

ii. H. pylori CAI antigen

Materials and methods

a. Origin of materials

Clones A1, 64/4, G5, A17, 24 and 57/D were
obtained from the lambda gt11 library. Clone B1 was obtained
from a genomic plasmid library of HindIII fragments. 007 was
obtained by PCR. The H. pylori strains producing the
cytotoxin were: G10, G27, G29, G32, G33, G39, G56, G65,
G105, G113A. The noncytotoxic strains were: G12, G21, G25,
G30, G204. They were isolated from endoscopy biopsy
specimens at the Grosseto Hospital (Tuscany, Italy). The
strain CCUG 17874 (cytotoxin positive) was obtained from
the Culture Collection of the University of Göteborg. The
noncytotoxic strains Pylo 2U+ (urease positive) and Pylo 2U-
(urease negative) were obtained from F. Megraud, Centre
Hospitalier, Bordeaux (France). E. coli strains DH10B
(Bethesda Research Laboratories), TG1, K12 delta H1 delta
trp, Y1088, Y1089, Y1090 are known in the art. Plasmid
Bluescript SK+ (Stratagene, La Jolla, CA) was used as a
cloning vector. The pEX34 a, b, c plasmids for the
expression of MS2 fusion proteins have been previously
described. The lambda gt11 phage vector used for the
expression library is from the lambda gt11 cloning system
kit (Bethesda Research Laboratories). E. coli strains were
cultured in LB medium (24). H. pylori strains were plated
onto selective media (5% horse blood, Columbia agar base
with Dent or Skirrow's antibiotic supplement, 0.2%
cyclodextrin) or in Brucella broth liquid medium containing
5% fetal bovine serum (6) or 0.3% cyclodextrin (25).
b. Growth of H. pylori and DNA isolation

H. pylori strains were cultured in solid or liquid
media for 3 days at 37°C, both in microaerophilic
atmosphere using Oxoid (Basingstoke, England) or Becton and
Dickinson (Cockeysville, MD) gas pack generators or in an
incubator containing air supplemented with 5% CO2 (26). The
bacteria were harvested and resuspended in STE (NaCl 0.1 M,
Tris-HCl 10 mM pH 8, EDTA 1 mM pH 8) containing lysozyme at
a final concentration of 100 micrograms/ml and incubated at
room temperature for 5 min. To lyse the bacteria, SDS was
added to a final concentration of 1% and the suspension was
heated at 65°C. After the addition of proteinase K at a
final concentration of 25 micrograms/ml the solution was
incubated at 50°C for 2 hours. The DNA was purified by CsCl
gradient in the presence of ethidium bromide, precipitated
with 77% ethanol and recovered with a sealed glass capillary.

c. Construction and screening of a lambda gt11 expression
library

To generate the lambda gt11 expression library,
genomic DNA from the CCUG 17874 strain partially digested
with the restriction enzymes HaeIII and AluI was used. After
fractionation on 0.8% agarose gel, the DNA between 0.6 and
8 kb in size was eluted using a Costar Spin-X (0.22 micron)
microcentrifuge filter. The products from each digestion
were combined and used to construct the expression library,
using the lambda gt11 cloning system kit (Bethesda Research
Laboratories) and the Gigapack II Gold packaging kit
(Stratagene, La Jolla, CA). The library, which contained 0.8-1
x 10^[illegible] recombinant phages, was amplified in E. coli
Y1088, obtaining 150 ml of a lysate with a titer of
10^[illegible] phages/ml, 85% of which were recombinant and
had an average insert size of 900 base pairs. Immunological
screening was performed by standard procedures, using the
Protoblot system (Promega, Madison, WI).

d. Construction of plasmid libraries

Attempts to make complete genomic libraries of
partially digested chromosomal DNA, using standard vectors
such as EMBL4 or lambda Dash, encountered the difficulties
also described by many authors in cloning H. pylori DNA and
failed to give satisfactory libraries. Therefore, partial
libraries were obtained using genomic DNA from strains CCUG
17874, G39 and G50 digested with the restriction enzyme
HindIII and cloned in Bluescript SK+. DNA ligation,
electroporation of E. coli DH10B, screening, and library
amplification were performed. Libraries ranging from
70000 to 85000 colonies with a background not exceeding
10% were obtained.

e. DNA manipulation and nucleotide sequencing 
DNA manipulation was performed using standard
procedures. DNA sequencing was performed using Sequenase 2.0
(USB) and the DNA fragments shown in Fig. 3 subcloned in
Bluescript KS+. Each strand was sequenced at least three
times. The region between nucleotides 1533 and 2289, for
which a DNA clone was not available, was amplified by PCR
and sequenced using asymmetric PCR and direct sequencing of
amplified products. The overlap of this region was
confirmed by one- and double-sided anchored PCR: an external
universal anchor (5'-GCAAGCTTATCGATGTCGACTCGAGCT-3'/5'-
GACTCGAGTCGACATCGA-3'), containing a protruding 5' HindIII
sequence and the recognition sites of ClaI, SalI and XhoI, was
ligated to primer-extended DNA and amplified. A second round
of PCR using nested primers was then used to obtain
fragments of DNA suitable for cloning and sequencing. The
sequence data were assembled and analyzed with the GCG
package (Genetics Computer Group, Inc., Madison, WI) running
on a VAX 3900 under VMS. The GenBank and EMBL databases were
examined using the EMBL VAXcluster.
f. Protein preparation and ELISA

Protein extracts were obtained by treating H.
pylori pellets with 6 M guanidine. Western blotting, SDS-
PAGE and electroelution were performed by standard procedures.
Fusion proteins were induced and purified by electroelution
or by ion exchange chromatography. Purified proteins were
used to immunize rabbits and to coat microtiter plates for
ELISA assays. Sera from people with normal mucosa, blood
donors and patients were obtained from A. Ponzetto (Torino,
Italy). Clinical diagnosis was based on histology of gastric
biopsies. Vacuolating activity of samples was tested on HeLa
cells as described by Cover et al., Infect. Immun. 59:1264-70
(1991).

a. Immunodominance and cytotoxicity

Western blots of H. pylori guanidine extracts
probed with sera from patients with gastroduodenal disease
showed that a protein of 130 kDa that is a minor component
in the Coomassie blue stained gel was strongly recognized by
all sera tested. The CAI protein was electroeluted and used
to raise a mouse serum that in a Western blot recognised
only this protein. This serum was then used to detect by
Western blotting the CAI protein in extracts of the H.
pylori strains. The antigen was present in all 10
strains that had vacuolizing activity on HeLa cells while it
was absent in the eight strains that did not have such
activity; in addition, the size of the protein varied
slightly among the strains. The CAI antigen was not detected
by Western blotting in the other species tested such as
Campylobacter jejuni, Helicobacter mustelae, E. coli, and
[text missing in source].

b. Structure of the cai gene

10^[illegible] clones of the lambda gt11 expression library
were screened using the mouse serum specific for the CAI
antigen and with a pool of sera from patients with
gastroduodenal diseases. The mouse serum detected positive
clones at a frequency of 3 x 10^[illegible]. Sequence analysis
of 8 clones revealed that they were all partially overlapping
with clone A1 shown in Fig. 3. The pool of human sera
identified many clones containing different regions of the
cai gene, including clones 57/D, 64/4 and 24 and several
clones overlapping clone A1.

In Fig. 3, clones A1, 64/4, G5, A17, 24, and 57/D
were obtained from the lambda gt11 library. Clone B1 was
obtained from a plasmid library of HindIII fragments.
E. coli containing plasmids 57/D, 64/4, B1 (B/1), and P1-24
(the latter plasmid from nucleotide 2130 to [illegible]) have
been deposited with the American Type Culture Collection
(ATCC), see below. 007 was obtained by PCR. The open
reading frame is shown at the bottom of Fig. 3. Arrows
indicate the position and direction of the synthetic
oligonucleotides used as primers for sequencing, and the
position of insertion of the repeated sequence of G39 is
shown. The nucleotide and amino acid sequence of one of the
repeated sequences found in strain G39 is also shown. The
capital letters indicate the sequences D1, D2, and D3
duplicated from the cai gene; the small letters indicate the
nucleotide and amino acid linkers; P = promoter, and
T = terminator.

The nucleotide sequence of the entire region was
determined using the clones derived from the lambda gt11
library, the clone B1 isolated from the HindIII plasmid
library, and the fragment 007 that was obtained by PCR of
the chromosomal DNA. Computer analysis of the 5935
nucleotide sequence revealed a long open reading frame
spanning nucleotides 535 to 3577 that was in frame with the
fusion proteins deriving from the lambda gt11 clones 64/4,
24, A1 and A17. Clone 57/D contained an open reading
frame only in the 3' end of the cloned fragment and therefore
could not make a gene fusion with the beta-galactosidase
gene of lambda gt11. The presence of an immunoreactive
protein in the lambda gt11 clone 57/D could only be
explained by the presence of an endogenous promoter driving
the expression of a non-fused protein. This hypothesis was
proven to be true by subcloning in both directions the insert
57/D into the Bluescript plasmid vector and showing that an
immunoreactive protein was obtained in both cases.
Conclusive evidence that the gene identified was indeed
coding for the CAI antigen was obtained by subcloning the
inserts A17 and 64/4 in the pEX plasmid vectors to
obtain fusion proteins that were purified and used to
immunize rabbits. The sera obtained specifically recognized
the CAI antigen band in cytotoxic H. pylori strains.

The cai gene coded for a putative protein of 1147
amino acids, with a predicted molecular weight of 128012.73
Daltons and an isoelectric point of 9.72. The basic
properties of the purified protein were confirmed by two
dimensional gel electrophoresis. The codon usage and the GC
content (37%) of the gene were similar to those described for
other H. pylori genes (13, 26). A putative ribosome binding
site, AGGAG, was identified 5 base pairs upstream from the
proposed ATG starting codon. Computer search for promoter
sequences in the region upstream from the ATG start codon
identified sequences resembling either -10 or -35 regions;
however, a region with good consensus to an E. coli
promoter, or resembling published H. pylori promoter
sequences, was not found. Primer extension analysis of
purified H. pylori RNA showed that there are two
transcriptional start sites, 104 and 214 base pairs
upstream from the ATG start codon. Canonical promoters could
not be identified upstream from either transcriptional
initiation site. The expression of a portion of the CAI
antigen by clone 57/D suggests that E. coli is also
recognizing a promoter in this region; however, it is not
clear whether E. coli recognizes the promoters of H.
pylori or whether the H. pylori DNA that is rich in A-T
provides E. coli with regions that may act as promoters. A
rho-independent terminator was identified downstream from
the stop codon. In Fig. 4, the AGGAG ribosome binding site
and terminator are underlined, and the repeated sequence and
motif containing 6 asparagines are boxed. The CAI antigen
was very hydrophilic, and did not show an obvious leader
peptide or transmembrane sequences. The most hydrophilic
region was from amino acids 600 to 900, where a number
of unusual features can also be observed: the repetition of
the sequences EFKNGKNKDPSK and EPYIA, and the presence of a
stretch of six contiguous asparagines (boxed in Fig. 4).
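The kind of search described above, locating a ribosome binding motif a short distance upstream of a start codon, can be sketched in a few lines of Python. This is a minimal illustration, not the software used in the original work: the function name, the `max_gap` window and the toy sequence are all hypothetical, and only the AGGAG motif and the 5 base pair spacing come from the text.

```python
def find_rbs(seq: str, motif: str = "AGGAG", max_gap: int = 12):
    """Yield (motif_start, atg_start, gap) for each ATG preceded by motif.

    gap is the number of bases between the end of the motif and the A of
    the ATG; all positions are 0-based indices into seq.
    """
    seq = seq.upper()
    atg = seq.find("ATG")
    while atg != -1:
        # Search only the window immediately upstream of this ATG.
        window_start = max(0, atg - max_gap - len(motif))
        hit = seq.rfind(motif, window_start, atg)
        if hit != -1:
            yield hit, atg, atg - (hit + len(motif))
        atg = seq.find("ATG", atg + 1)


# Hypothetical toy sequence (not from Fig. 4): AGGAG, 5 spacer bases, ATG.
toy = "TTTAAAGGAGCTTAAATGAAAGCA"
print(list(find_rbs(toy)))  # [(5, 15, 5)]
```

A real analysis would also score partial matches to the consensus, which this exact-match sketch does not attempt.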
c. Diversity of the cai gene

Diversity of the gene appears to be generated by
internal duplications. To find out the mechanism of size
heterogeneity of the CAI proteins in different strains, the
structure of one of the strains with a larger CAI protein
(G39) was analyzed using Southern blotting, PCR and DNA
sequencing. The results showed that the cai genes of G39 and
CCUG 17874 were identical in size until position 3406, where
the G39 strain was found to contain an insertion of 204 base
pairs, made by two identical repeats of 102 base pairs.
Each repeat was found to contain sequences deriving from the
duplication of 3 segments of DNA (sequences D1, D2 and D3 in
Fig. 3) coming from the same region of the cai gene and
connected by small linker sequences. A schematic
representation of the region where the insertion occurred
and of the insertion itself is shown in Fig. 3.
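An insertion built from identical tandem repeats, such as the 204 base pair insertion made of two 102 base pair copies, can be recognised computationally once its sequence is known. The Python sketch below is an illustrative assumption rather than the method used in the original analysis; the function name and the toy repeat unit are hypothetical and the real G39 sequence is not reproduced here.

```python
def tandem_unit(insert: str):
    """Return the shortest unit u with insert == u * k for some k >= 2.

    Returns None when the insert is not an exact tandem repeat.
    """
    n = len(insert)
    for size in range(1, n // 2 + 1):
        if n % size == 0 and insert[:size] * (n // size) == insert:
            return insert[:size]
    return None


# Hypothetical 102-base unit used only to exercise the function.
unit = "ACGT" * 25 + "GC"
insertion = unit * 2

found = tandem_unit(insertion)
print(len(insertion), len(found))  # 204 102
```

Exact string comparison is enough here because the text describes the two repeats as identical; diverged repeats would need an alignment-based approach.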
d. cai gene absent in noncytotoxic strains

To investigate why the CAI antigen was absent in
the noncytotoxic strains, DNA from two of them (G50 and
G21) was digested with EcoRI, HindIII and HaeIII
restriction enzymes and tested by Southern blotting using
two probes internal to the cai gene, spanning nucleotides
520-1040 and 2850-4331 respectively. Both probes recognized
strongly hybridizing bands in strains CCUG 17874 and G39.
The bands varied in size in the two strains, in agreement
with the gene diversity. However, neither probe hybridized
to the G50 and G21 DNA. This showed that the noncytotoxic
strains tested do not contain the cai gene.
e. Serum antibodies 

The presence of serum antibodies against the CAI
antigen correlated with gastroduodenal diseases. To study
the quantitative antibody response to the CAI antigen, the
fusion protein produced by the A17 fragment subcloned in
pEX34 was purified to homogeneity and used to coat
microtiter plates for an ELISA test. In this assay, the
patients with gastroduodenal pathologies had an average
ELISA titer that was significantly higher than that found in
randomly selected blood donors and people with normal
gastric mucosa. To evaluate whether the antibody titer
correlated with a particular gastroduodenal disease, the
sera from patients with known histological diagnosis were
tested in the ELISA assay. Patients with duodenal ulcer had
an average antibody titer significantly higher than patients
with all the other diseases. Altogether, the ELISA was found
to be able to predict 75.3% of the patients with any
gastroduodenal disease and 100% of the patients with
duodenal ulcer.

In one particular ELISA, a recombinant protein
containing 230 amino acids deriving from the CAI antigen was
identified by screening an expression library of H. pylori
DNA using an antiserum specific for the protein. The
recombinant antigen was expressed as a fusion protein in E.
coli, purified to homogeneity, and used to coat microtiter
plates. The plates were then incubated for 90 minutes with
a 1:[illegible] dilution of goat anti-human IgG alkaline
phosphatase conjugate. Following washing, the enzyme
substrate was added to the plates and the optical density at
405 nm was read 30 minutes later. The cutoff level was
determined as the mean absorbance plus two standard
deviations, using sera from 20 individuals that had neither
gastric disease nor detectable anti-H. pylori antibodies in
Western blotting. The ELISA assay was tested on the
peripheral blood samples of eighty-two dyspeptic patients
(mean age 50.6±13.4 years, ranging from 28 to 80) undergoing
routine upper gastrointestinal endoscopy examination. The
gastric antral mucosa of patients was obtained for histology
and Giemsa stain. Twenty of the patients had duodenal
ulcer, 5 had gastric ulcer, 43 had chronic active gastritis
type B, 8 had duodenitis and 6 had a normal histology of
gastric mucosa. All of the patients with duodenal ulcer had
an optical density value above the cutoff level. The
patients with duodenitis, gastric ulcer, and chronic
gastritis had a positive ELISA value in 75%, 80% and 33.9%
of the cases, respectively. The agreement between ELISA and
histological Giemsa staining was 95% in duodenal ulcer, 98%
in duodenitis, 80% in gastric ulcer and 55.8% in chronic
gastritis. This assay gives an excellent correlation with
duodenal ulcer disease (p<0.0005).
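The cutoff rule described above, mean absorbance of the negative controls plus two standard deviations, can be expressed as a short Python sketch. The text does not say whether the sample or the population standard deviation was used, so the sample form is an assumption here; the function names and the optical density readings are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def elisa_cutoff(negative_ods, n_sd=2.0):
    """Cutoff = mean + n_sd * sample standard deviation of control ODs."""
    return mean(negative_ods) + n_sd * stdev(negative_ods)


def is_positive(od, cutoff):
    """A serum is scored positive when its OD exceeds the cutoff."""
    return od > cutoff


# Hypothetical OD405 readings for illustration only.
negatives = [0.10, 0.12, 0.08, 0.11, 0.09]
cutoff = elisa_cutoff(negatives)

print(round(cutoff, 3))
print(is_positive(0.50, cutoff), is_positive(0.12, cutoff))
```

With 20 control sera, as in the text, the same function applies unchanged; only the list of readings grows.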

iii. Heat shock protein (hsp)

Materials and methods

a. H. pylori strains and growth conditions

H. pylori strains used were: CCUG 17874, G39 and
G33 (isolated from gastric biopsies in the hospital of
Grosseto, Italy), Pylo 2U+ and Pylo 2U- (provided by F.
Megraud, hospital Pellegrin, Bordeaux, France), BA96
(isolated from gastric biopsies at the University of Siena,
Italy). Strain Pylo 2U+ is noncytotoxic; strain Pylo 2U- is
noncytotoxic and urease-negative. All strains were
routinely grown on Columbia agar containing 0.2% of
cyclodextrin, 5 µg/ml of cefsulodin and 5 µg/ml of
amphotericin B under microaerophilic conditions for 5-6 days
at 37°C. Cells were harvested and washed with PBS. The
pellets were resuspended in Laemmli sample buffer and lysed
by boiling.

Sera of patients affected by gastritis and ulcers
(provided by A. Ponzetto, hospital "Le Molinette", Torino,
Italy) and sera of patients with gastric carcinoma (provided
by F. Roviello, University of Siena, Italy) were used.

b. Immunoscreening of the library

Five hundred thousand plaques of a λgt11 H. pylori
DNA expression library were mixed with 5 ml of a suspension
of E. coli strain Y1090 grown O/N in LB with 0.2% maltose
and 10 mM MgSO4, and resuspended in 10 mM MgSO4 at 0.5 O.D.
After 10 minutes incubation at 37°C, 75 ml of melted
TopAgarose were poured in the bacterial/phage mix and the
whole was plated on BBL plates (30,000 plaques/plate).
After 3.5 hrs incubation of the plated library at 42°C,
nitrocellulose filters (Schleicher and Schuell, Dassel,
Germany), previously wet with 10 mM IPTG, were set on plates
and incubation was prolonged for 3.5 hrs at 37°C and then
O/N at [illegible]. Lifted filters with lambda proteins were
rinsed in PBS, and saturated in 5% nonfat dried milk
dissolved in TBST (10 mM TRIS pH 8, 100 mM NaCl, 5 mM MgCl2)
for 20'. The first hybridization step was performed with the
sera of patients; to develop and visualize positive plaques
we used an anti-human Ig antibody alkaline phosphatase
conjugate (Cappel, West Chester, PA) and the NBT/BCIP kit
(Promega, Madison, WI) in AP buffer (100 mM Tris pH 9.5,
100 mM NaCl, 5 mM MgCl2) according to the manufacturer's
instructions.

c. Recombinant DNA procedures 

Reagents and restriction enzymes used were from
Sigma (St. Louis, MO) and Boehringer (Mannheim, Germany).
Standard techniques were used for molecular cloning, single-
stranded DNA purification, transformation in E. coli,
radioactive labeling of probes, colony screening of the H.
pylori DNA genomic library, Southern blot analysis, PAGE and
Western blot analysis.

d. DNA sequence analysis 

The DNA fragments were subcloned in Bluescript SK+
(Stratagene, San Diego, CA). Single-stranded DNA sequencing
was performed by using P-labeled α-dATP (New England Nuclear,
Boston, MA) and the Sequenase kit (U.S. Biochemical Corp.,
Cleveland, OH) according to the manufacturer's instructions.
The sequence was determined in both strands and each strand
was sequenced, on average, twice. Computer sequence
analysis was performed using the GCG package.

e. Recombinant proteins 

MS2 polymerase fusion proteins were produced using
the vector pEX34A, a derivative of pEX31. Insert Hp67 (from
nucleotide 445 to nucleotide 1402 in Fig. 5) and the EcoRI
linkers were cloned in frame into the EcoRI site of the
vector. In order to confirm the location of the stop codon,
the HpG3' HindIII fragment was cloned in frame into the
HindIII site of pEX34A. Recombinant plasmids were
transformed in E. coli K12 delta H1 delta trp. In both
cases, after induction, a fusion protein of the expected
molecular weight was produced. In the case of the
EcoRI/EcoRI fragment, the fusion protein obtained was
electroeluted and used to immunize rabbits using standard
protocols.

Results

a. Screening of an expression library and cloning of H.
pylori hsp

In order to find a serum suitable for the
screening of an H. pylori DNA expression library, sonicated
extracts of H. pylori strain CCUG 17874 were tested in
Western blot analysis against sera of patients affected by
different forms of gastritis. The pattern of antigen
recognition by different sera was variable, probably due to
differences in the individual immune response as well as to
the differences in the antigens expressed by the strains
involved in the infection.

Serum W19 was selected to screen a λgt11 H.
pylori DNA expression library to identify H. pylori specific
antigens expressed in vivo during bacterial growth.
Following screening of the library with this serum, many
positive clones were isolated and characterized. The
nucleotide sequence of one of these, called Hp67, revealed
an open reading frame of 958 base-pairs, coding for a
protein with high homology to the hsp60 family of heat-shock
proteins, Ellis, Nature 358:191-92 (1992). In order to
obtain the entire coding region, we used fragment Hp67 as a
probe on Southern blot analysis of H. pylori DNA digested
with different restriction enzymes. Probe Hp67 recognized
two HindIII bands of approximately 800 and 1000 base-pairs,
respectively. A genomic H. pylori library of HindIII
digested DNA was screened with probe Hp67 and two positive
clones (HpG5' and HpG3') of the expected molecular weight
were obtained. E. coli containing plasmids pHp60G2
(approximately nucleotides 1 to 829) and pHp60G5
(approximately nucleotides 624 to 1838) were deposited with
the American Type Culture Collection (ATCC).
b. Sequence analysis 

Nucleotide sequence analysis revealed an open
reading frame of 1638 base-pairs, with a putative ribosome
binding site 6 base-pairs upstream of the starting ATG. Fig.
5 shows the nucleotide and amino acid sequences of H. pylori
hsp. The putative ribosome-binding site and the internal
HindIII site are underlined. Cytosine in position 445 and
guanine in position 1402 are the first and last nucleotide,
respectively, in fragment Hp67. Thymine 1772 was identified
as the last putative nucleotide transcribed using an
algorithm for the localization of factor-independent
terminator regions. The open reading frame encoded a
protein of 546 amino acids, with a predicted molecular
weight of 58.3 kDa and a predicted pI of 5.37. The codon
preference of this gene is in agreement with the H. pylori
codon usage.
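The figures quoted for the hsp gene can be checked against one another. The Python sketch below assumes inclusive 1-based coordinates for fragment Hp67 and assumes that the 1638 base-pair figure excludes the stop codon; both are assumptions about the numbering, and the helper names are illustrative only.

```python
def span_length(first: int, last: int) -> int:
    """Length of a fragment given inclusive 1-based end coordinates."""
    return last - first + 1


def codon_count(orf_bp: int) -> int:
    """Number of codons in an ORF; rejects lengths not divisible by 3."""
    if orf_bp % 3:
        raise ValueError("ORF length is not a multiple of 3")
    return orf_bp // 3


hp67_bp = span_length(445, 1402)      # fragment Hp67
residues = codon_count(1638)          # full open reading frame
avg_residue_da = 58.3e3 / residues    # mean mass per residue, in daltons

print(hp67_bp, residues, round(avg_residue_da, 1))
```

The fragment length agrees with the 958 base-pair open reading frame reported for Hp67, the codon count agrees with the 546 residues, and the mean residue mass of roughly 107 Da is in the range expected for a typical protein.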

The analysis of the hydrophilicity profiles
revealed a protein mostly hydrophilic, without a predicted
leader peptide or other transmembrane domains. The amino
terminal sequence showed 100% homology to the sequence of 30
amino acids determined by Dunn et al., Infect. Immun.
60:1946-51 (1992) on the purified protein and differed by
only one residue (Ser42 instead of Lys) from the sequence of
44 amino acids published by Evans et al., Infect. Immun.
60:2125-27 (1992). The N-terminal
sequence of the mature hsp protein did not contain the
starting methionine, indicating that this had been removed
after translation.

c. Homology with hsp60 family

The amino acid sequence analysis showed a very
strong homology with the family of heat-shock proteins
hsp60, whose members are present in every living organism.
Based on the degree of homology between hsp60 proteins of
different species, H. pylori hsp belongs to the subgroup of
hsp60 proteins of Gram negative bacteria; however, the
degree of homology to the other proteins of the hsp60 family
is very high (at least 54% identity).

d. Expression of recombinant proteins and production of a 
polyclonal antiserum 

The inserts of clone Hp67 and of clone HpG3' were
subcloned in the expression vector pEX34A in order to
express these open reading frames fused to the amino terminus
of the MS2 polymerase. The clones produced recombinant
proteins of the expected size and were recognized by the
human serum used for the initial screening. The fused
protein derived from clone Hp67 was electroeluted and used
to immunize rabbits in order to obtain anti-hsp specific
polyclonal antisera. The antiserum obtained recognized both
fusion proteins, and a protein of 58 kDa on whole-cell
extracts of several strains of H. pylori tested, including
a urease-negative strain and noncytotoxic strains.

Hsp has been shown to be expressed by all the H.
pylori strains tested and its expression is not associated
with the presence of the urease or with the cytotoxicity.
The protein recognized by the anti-hsp antiserum was found
in the water soluble extracts of H. pylori and copurified
with the urease subunits. This suggests a weak association
of this protein with the outer bacterial membrane. Thus,
hsp can be described as urease-associated and surface
exposed. The cellular surface localization is surprising as
most of the hsp homologous proteins are localized in the
cytoplasm or in mitochondria and plastids. The absence of
a leader peptide in hsp suggests that this is either
exported to the membrane by a peculiar export system, or
that the protein is released from the cytoplasm and is
passively adsorbed by the bacterial membrane after death of
the bacterium.

Hsp60 proteins have been shown to act as molecular
chaperones assisting the correct folding, assembly and
translocation of either oligomeric or multimeric proteins.
The cellular localization of H. pylori hsp and its weak
association with urease suggest that hsp may play a role in
assisting the folding and/or assembly of proteins exposed on
the membrane surface and composed of multiple subunits such
as the urease, whose final quaternary structure is A6B6.
Austin et al., J. Bacteriol. 174:7470-73 (1992) showed that
the H. pylori hsp ultrastructure is composed of seven
subunits assembled in a disk-shaped particle that further
stack side by side in groups of four. This structure
resembles the shape and dimension of the urease
macromolecule and this could explain the common properties
of these two macromolecules that lead to their
copurification. The H. pylori hsp gene, however, is not part
of the urease operon. In agreement with the gene structure of
other bacterial hsp60 proteins, it should be part of a
dicistronic operon.

e. Presence of anti-hsp antibodies in patients with
gastroduodenal diseases

The purified fusion protein was tested by Western
blot using sera of patients infected by H. pylori and
affected by atrophic and superficial gastritis, and patients
with duodenal and gastric ulcers: most of the sera
recognized the recombinant protein. However, the degree of
recognition greatly varied between different individuals and
the antibody levels did not show any obvious correlation
with the type of disease. In addition, antibodies against
H. pylori antigens and in particular against hsp protein
were found in most of the 12 sera of patients affected by
gastric carcinoma that were tested. Although H. pylori hsp
recognition could not be put in relation with a particular
clinical state of the disease, given the high conservation
between H. pylori hsp and its human homolog, it is possible
that this protein may induce autoimmune antibodies cross-
reacting with the human counterpart. This class of
homologous proteins has been implicated in the induction of
autoimmune disorders in different systems. The presence of
high titers of anti-H. pylori hsp antibodies, potentially
cross-reacting with the human homolog, in dyspeptic patients
suggests that this protein has a role in gastroduodenal
disease. This autoreactivity could play a role in the
tissue damage that occurs in H. pylori-induced gastritis,
thus increasing the pathogenic mechanisms involved in the
infection by this bacterium.

The high level of antibodies against such a
conserved protein is somewhat unusual; due to the high
homology between members of the hsp60 family, including the
human one, this protein should be very well tolerated by the
host immune system. The strong immune response observed in
many patients may be explained in two different ways: (1)
the immune response is directed only against epitopes
specific for H. pylori hsp; (2) the immune response is
directed against epitopes which are in common between H.
pylori hsp and the human homolog.

Deposit of Biological Materials

The following materials were deposited on December
15, 1992 and January 22, 1993 by Biocine Sclavo, S.p.A., the
assignee of the present invention, with the American Type
Culture Collection (ATCC), 12301 Parklawn Drive, Rockville,
Maryland, phone (301) 231-5319, under the terms of the
Budapest Treaty on the International Recognition of the
Deposit of Microorganisms for Purposes of Patent Procedure.
For the cytotoxin protein (CT):

ATCC No. 69157 E. coli TG1 containing the plasmid TOXHH1
ATCC No. n/a E. coli TG1 containing the plasmid TOXEE1

For the CAI protein:

ATCC No. 69158 E. coli TG1 containing the plasmid 57/D
ATCC No. 69159 E. coli TG1 containing the plasmid 64/4
ATCC No. 69160 E. coli TG1 containing the plasmid P1-24
ATCC No. 69161 E. coli TG1 containing the plasmid B/1

For the heat shock protein (hsp):

ATCC No. 69155 E. coli TG1 containing the plasmid pHp60G2
ATCC No. 69156 E. coli TG1 containing the plasmid pHp60G5

These deposits are provided as a convenience to
those of skill in the art, and are not an admission that a
deposit is required under 35 U.S.C. §112. The nucleic acid
sequences of these deposits, as well as the amino acid
sequences of the polypeptides encoded thereby, are
incorporated herein by reference and should be referred to
in the event of any error in the sequences described herein
as compared with the sequences of the deposits. A license
may be required to make, use, or sell the deposited
materials, and no such license is granted hereby.
What is claimed is:

1. A recombinant Helicobacter pylori protein, or
a derivative or fragment thereof.

2. The recombinant protein according to claim 1
wherein the protein is a Helicobacter pylori cytotoxin or a
precursor, derivative or fragment thereof.

3. The recombinant protein according to claim 2
wherein the cytotoxin, precursor, derivative or fragment
thereof has the amino acid sequence of Figure or a
portion thereof.

4. The recombinant protein according to claim 1
wherein the protein is a Helicobacter pylori cytotoxin
associated immunodominant antigen, or a derivative or
fragment thereof.

5. The recombinant protein according to claim 4
wherein the cytotoxin associated immunodominant antigen,
derivative or fragment has the amino acid sequence of Figure
4, or a portion thereof.

6. The recombinant protein according to claim 1
wherein the protein is a Helicobacter pylori heat shock
protein, or a derivative or fragment thereof.

7. The recombinant protein according to claim 6,
wherein the heat shock protein, derivative or fragment has
the amino acid sequence of Figure 5 or a portion thereof.

8. The recombinant protein according to claim 2
or 3 wherein the recombinant protein exhibits substantially
no toxicity, or substantially reduced toxicity.

9. The recombinant protein according to any one
of claims 4 to 7 wherein the recombinant protein is
immunogenic and exhibits no functional contribution to
toxicity, or a substantially reduced functional contribution
to toxicity.

10. The recombinant protein according to claim 8
or 9 wherein the recombinant protein is chemically modified
to reduce or abolish toxicity or functional contribution to
toxicity.

11. The recombinant protein according to claim 8
or 9 wherein the recombinant protein contains one or more
amino acid substitutions or deletions.

12. The recombinant protein according to any one
of the preceding claims which is labelled or coupled to a
solid support.

13. The recombinant protein according to any one
of claims 1 to 11 for use in the treatment of Helicobacter
pylori infection.

14. The recombinant protein according to any one
of claims 1 to 11 for use as a vaccine.

15. A vaccine or therapeutic composition
comprising a recombinant protein according to any one of
claims 1 to 11 and a pharmaceutically acceptable carrier.



16. The vaccine or therapeutic composition
according to claim 15 comprising two or more recombinant
proteins according to any one of claims 1 to 11.



17. The vaccine or therapeutic composition
according to claim 16 comprising, in combination, two or
more of

i) a recombinant Helicobacter pylori cytotoxic
protein precursor, derivative or fragment thereof,

ii) a Helicobacter pylori recombinant cytotoxin
associated immunodominant antigen, or a derivative or
fragment thereof,

iii) a Helicobacter pylori recombinant heat shock
protein or a derivative or fragment thereof and/or

iv) a Helicobacter pylori urease.

18. The vaccine or therapeutic composition
according to any one of claims 15 to 17 comprising an
adjuvant.

19. A method for the preparation of a vaccine or
therapeutic composition according to any one of claims 15 to
18 comprising bringing one or more recombinant proteins
according to any one of claims 1 to 11 into association with
a pharmaceutically acceptable carrier and optionally an
adjuvant.

20. An immunodiagnostic assay comprising at least
one step involving, as at least one binding partner, a
recombinant protein according to any one of claims 1 to 12,
optionally labelled or coupled to a solid support.

21. An immunodiagnostic kit for performing an
assay according to claim 20, comprising at least one
recombinant protein according to any one of claims 1 to 12.

22. Use of one or more recombinant proteins
according to any one of claims 1 to 11 for the manufacture
of a medicament for the treatment of Helicobacter pylori
infection.

23. A method of treatment of an individual
infected with Helicobacter pylori comprising administering
an effective amount of a recombinant protein according to
claims 1 to 11.

24. The method of treatment according to claim 23
comprising administering an effective amount of, in
combination, two or more of

i) a recombinant Helicobacter pylori cytotoxic
protein precursor, derivative or fragment thereof,

ii) a Helicobacter pylori recombinant cytotoxin
associated immunodominant antigen, or a derivative or
fragment thereof,

iii) a Helicobacter pylori recombinant heat shock
protein or a derivative or fragment thereof and/or

iv) a Helicobacter pylori urease.

25. A method of vaccination comprising
administering an immunologically effective amount of, in
combination, two or more of

i) a recombinant Helicobacter pylori cytotoxic
protein precursor, derivative or fragment thereof,

ii) a Helicobacter pylori recombinant cytotoxin
associated immunodominant antigen, or a derivative or
fragment thereof,

iii) a Helicobacter pylori recombinant heat shock
protein or a derivative or fragment thereof and/or

iv) a Helicobacter pylori urease.

26. A recombinant polynucleotide encoding a
recombinant protein according to any one of claims 1 to 11.

27. A recombinant polynucleotide encoding a 
Helicobacter pylori cytotoxic protein or a derivative or 
fragment thereof comprising all or part of the nucleotide 
sequence of Figure 1. 



28. A recombinant polynucleotide encoding a
Helicobacter pylori recombinant cytotoxin associated
immunodominant antigen or a derivative or fragment thereof
comprising all or a part of the nucleotide sequence of
Figure 4.



29. A recombinant polynucleotide encoding a
Helicobacter pylori recombinant heat shock protein or a
derivative or fragment thereof comprising all or a part of
the nucleotide sequence of Figure 5.

30. A polynucleotide probe comprising all or part
of the recombinant polynucleotide according to any one of
claims 26 to 29.

31. A nucleic acid assay wherein at least one
step involves a polynucleotide probe according to claim 30.

32. A kit for performing a nucleic acid assay
comprising at least one polynucleotide probe according to
claim 30.

33. A polynucleotide amplification process
employing a polynucleotide primer wherein at least one
primer is a recombinant polynucleotide comprising all or
part of the recombinant polynucleotide according to any one
of claims 26 to 29.

34. A kit for performing a polynucleotide
amplification process employing a polynucleotide primer
wherein at least one primer is a recombinant
polynucleotide comprising all or part of the recombinant
polynucleotide according to any one of claims 26 to 29.

35. A vector comprising a recombinant
polynucleotide according to any one of claims 26 to 29.

36. A host cell transformed with a vector
according to claim 35.

37. A method for the production of a recombinant
polypeptide according to any one of claims 1 to 11,
comprising culturing a host cell according to claim 36 and
isolating the recombinant polypeptide.
1 AAAAAGAAAG GAAGAAAATG GAAATACAAC AAACACACCG CAAAATCAAT
51 CGCCCTCTGC TTTCTCTCGC TTTAGTAGGA GCATTAGTCA QCATCACACC 
101 GCAACAAA6T CATGCCGCCT TTTTCACAAC CGTGATCATT CCAGCCAHG 
151 nSGGGGTAT CGCTACAGGC ACCGCTGTAG GAACGGTCTC AGGGCHCTT 
201 AGCTGGGGGC TCAAACAAGC CGAAGAAGCC AATAAAACCC CAGATAAACC 
251 CGATAAA6TT TCiGCGCAnC AAGCAGGAAA AGGCTTTAAT GflATTCCCTA 
301 ACAAGGAATA CGACTTATAC AGATCCCTTT TATCCAGTAA GATTGATGGA 
351 GGTTGGGATT GGGG6AAT6C C6CTA6GCAT TAHGGGTCA AAG6CGGGCA 
m ACAGAATAAG CHGAAGTGG ATATGAAAGA C6CT6TAGGG ACTTATACCT 
«51 TATCAGGGCT TAGAAACTTT ACTGGTGGGG ATTTAGATGT CAATATGCAA 
501 AAAGCCACn TACGCnCGG CCAATTCAAT GGCAAnCTT HACAAGCTA 
551 TAA65ATAGT GCTGATC6CA CCACGAGAGT 6ATTTCAACG CTAAAAATAT 
501 CTCAATT6AT AATTTTGCAG AAATCAACAA CTCGTGTGGG TTCTGGAGCC 
651 GGGAGGAAAG CCAGCTCTAC GGTTTTGACT HGCAAGCTT CAGAA6G6AT 
701 CACTAGCGAT AAAAACQCTG AAATTTCTCT TTATGATG6T GCCACGCTCA 
751 ATTTG6CTTC AAGCAGCGTT AAAHAATGG GTA/TTGTGTG GATGGGCCGT 
801 TTGCAATACG TGGGAGCGTA TTrGGCCCCT TCATACAGCA CGATAAACAC 
851 nCAAAAGTA ACA6GGGAAG TGAATnTAA CCACCTCACT GHGGCGATA 
901 AAAACGtCGC TCAAGCG66C AHATCGCTA ATAAAAAGAC TAATAnGGC 
951 ACACTGGATT TGTGGCAAAG CGCCGGGHA AACAHATCG- CTCCTCCAGA 
1001 A6GTGGCTAT AAGGATAAAC CCAATAATAC CCCTTCTCAA AGTGGTGGTA 
1051 AAAACGACAA AAATGAAAGC GCTAAAAACG ACAAACAAGA GAGCAGTCAA 
1101 AATAATAGTA ACACTCAGGT CAHAACCCA CCCAATAGTG C6CAAAAAAC 
1151 A6AAGTTCAA CCCACGCAAG TCATTGATGG GCCTTTTGCG G6CG6CAAA6 
1201 ACACGGHGT CAATATCAAC CGCATCAACA CTAACGCTGA TGGCACGATT 
1251 AGAGTGGGAG GGTTTAAAGC nCTCHACC ACCAATGCGG CTCATTT6CA 
1301 TATCGGCAAA GGCGGTGTCA ATCTGTCCAA TCAAGGGAfiC GGGCGCTCTC 

FIG. 1A 



WO 93/18150



PCT/EP93/00472 




1351 HATAGTGGA AAATCTAACT GGGAATATCA CCGTTGATGG GCCTTTAA6A 
1101 GTGAATAATC AAGT6GGTGG CTATGCTTTG GCAGGATCAA GCGCGAATTT 
1151 TGAGTTTAAG 6CT66TACGG ATACCAAAAA CG6CACAGCC ACTTTTAATA 
1501 ACGATATTAG TCTGGGAAGA nTGTGAATT TAAAGGT66A TGCTCATACA 
1551 GCTAATnTA AAGGTAHGA TACGGGTAAT GGTGGTTTCA ACACCHAGA 
ICOl TTTTA6T6GC GHACAGACA AAGTCAATAT CAACAA6CTC AHACGGCH 
TfiSl CCACTAATGT G6CCGTTAAA AACHCAACA HAATGAAH GATTGTTAAA 
1701 ACCAATGGGA TAAGTGTGGG GGAATATACT CATITTAGCG AAGATATAGG 
1751 CAGTCAATCG CGCATCAATA CCGTGCGTrT 56AAACTGGC ACTAGGTCAC 
1801 TTTTCTCTGG GGGTGTTAAA TTTAAAGGI6 GCGAAAAAII' GGI lAIAGAI 
1851 GAGnmCT ATAGCCCTTG GAAnATTTT GACCCTACAA ATATTAAAAA 
1901 TGnGAAATC ACCAATAAAC nGCTTiTGG ACCTCAAGGA A6TCCTTGGG 
1951 GCACATCAAA ACTTATGTTC AATAATCTAA CCCTA66TCA AAATGCGGTC 
2001 ATGGAHATA GCCAATTTTT AAATHAACC AHCAAGGGG ATTTCATeAA 
2051 CAATCAAGGC ACTATCAACT ATCT6GTCCG AGGTGGGAAA GT6GCAACCT 
2101 TAAGC6TAGG CAAT6CA6CA GCTATGAT6T TTAATAATGA TATAGACA6C 
2151 GCGACCGGAT TTTACAAACC GCTCATCAAG AHAACAGCG 'CTCAAGATCT 
2201 CAHAAAAAT ACA6AACATG nTTATTGAA AGCGAAAATG ATTGCnATG 
2251 GTAATGTTTC TACA6GTACC AAT&GCATTA GTAATGTTAA TCTAGAAGAG 
2301 CAATTCAAA6 AGCGCCTAGC CCTTTATAAC AACAATAACC GCAT6GATAC 
2351 HGTGTGGTG CGAAATACTG ATGACAHAA AGCATGCGGT ATGGCTATC& 
2101 GCGATCAAAG CATGGTGAAC AACCCTGACA ATTACAAGIA ICTTATCGGT 
2151 AAGGCATGGA AAAATATAGG GATCAGCAAA ACAGCTAATG GCTCTAAAAT 
2501 nCGGTGTAT TAHTAGGCA AHCTACGCC TACTGAGAAT GGTGGCAATA 
2551 CCACAAATTT ACCCACAAAC ACCACTAGCA ATGCACGTTC TGCCAACAAC 
2501 GCCCHGCAC AAAACGCTCC TTTCGCTCAA CCTAGT6CTA CTCCTAATTT 
2651 AGTC6CTATC AATCAGCATG ATTTTGGCAC TAHGAAAGC QT6TTT6AAT 

FIG. 1B 
?70] TGGCTAACCG CTCTAAAGAT AHGACACGC TTTATGCTAA CTCAGGCGCT
2/51 CAAGGCAGGG ATCTCTTACA AACCTTAHG ATTGATAGCC ATGATGCGGG 
2801 TTATGCCAGA AAAATGAHG ATGCTACAAG C6CTAATGAA ATCACCAAGC 
2851 AATTRAATAC GGCCACTACC ACTTTAAACA ACATAGCCAG TIIAGAGCAI 
2901 AAAACCAGCG GCTTACAAAC TTTGAGCTTG AGTAAT6CGA TCATTTTAAA 
2951 nCTCGTTTA GTCAATCTCT CCAGGAGACA CACCAACCAT ATTGACTC6T 
3001 TC6CCAAACG CHACAAGCT HAAAAGACC AAAAATTCGC HCTTTAGAA 
3U51 AGCGCGGCAG AA6TGTTGTA TCAATTTGCC CCTAAATATG AAAAACCTAC 
3101 CAATGTTTG6 6CTAAC6CTA HGGGGGAAC 6AGCTTGAAT AATGGCTCTA 
3151 AC6CTTCATT GTATGGCACA AGCGCGQ6CG TAGACGCHA. CCTTAACGGG- 
3201 CAAGTGGAAG CCATTGTGGG CGGTTTTGGA AGCTAIGGH ATAGCTCTTT 
3251 TAATAATCGT GCGAACTCCC HAACTCTGe Q6CCAATAAC ACTAATTTTG 
3301 GC6T6TATAG CCGTATTTTA ACCAACCAGC ATGAATTTGA CTTT6AAGCT 
3251 CAAGG6GCAC TAGGGAGCGA TCAATCAAGC nGAATTTCA AAAGCGCTCT 
31*01 AHACAAGAT TTGAATCAAA GCTATCAHA CTXAGCCTAT AGCGCTGCAA 
3^151 CAAGAGC6A6 CTATRRTTAT GACTTCGCGT TTTTTAGGAA CGCTTIAGTG 
3501 TTAAAACCAA GCGTGGGTGT GAGCTATAAC CATTTAGGn CAACCAACTT 
3551 TAAAAGCAAC AGCACCAATC AAGTGGCITT 6AAAAATGGC TCTA6CA6TC 
3601 AGCATTTAn CAACGCTAGC GCTAAT6TG6 AAGCGCGCTA nATTATGGG 
3551 GACACTICAT ACTTCTACAT GAAT6CTGGA GTinACAAG AGHCGCTCA 
3701 TGTTGGCTCT AATAACGCCG CGTCTTTAAA CACCTTTAAA GTGAATGCCG 
3751 CTCGCAACCC TTTAAATACC CATGCCAGAG T6ATGAT6GG TGGGGAATTA 
3801 AAATl A6CTA AAGAAGT6TT TTTGAATTTG GGCGnGTTT ATTTGCACAA 
3851 TnGAnTCC AATATAGGCC ATTTCGCTTC CAATHAGGA ATGAG6TATA 
3901 6TTTCTAAAT Ar.r.GCTCTTA AACCCATGCT CAAAGCATGG GTTTGAAATC 
3951 HACAAAACA 

FIG. 1C 
1 HEIQQTHRKI NRPLVSLALV GALVSITPQQ SHAAFFTTVI IPAIVGGIAT
51 GTAVGTVSGL LSWGLKOAEE ANKTPDKPDK VWRI0ASK6F NEFPNKEYni 
101 YRSLLSSKID GGWDWGNAAR HYWVKGGQQN KLEVDMKDAV GTYTLSGLRN 
151 FTGGDLDVNM QKATLRLGQF NGNSFTSYKD SADRTTRVIS TLKISQLIIL 
201 OKSTTRVGSG A6RKASSTVI TI.QASEGITS BKNAFISLYD 6ATLNLASSS 
2bl VKLMGNVWHG RLQYVGAYLA PSYSTINTSK VTGEVNFNHL TVGDKNAAQA 
301 GIIANKKTNI GTLDLWQSA6 LNIIAPPEG6 YKDKPNNTPS QS6AKNDKNE 
351 SAKNDKQESS QNNSNTQVIN PPNSAQKTEV OPTQVIDGPF A6GKDTVVNI 
'101 NRINTNADGT IRVGGFKASL HNAAHLHIG KGGVNLSNQA SGRSLIVENL 
^51 T6NITVD6PL RVNNQV66YA LAGSSANFEF KA6TDTKNGT ATFNNDISL6 
501 RFVNLKVDAH TANFK6IDTG N6GFNTLDFS GVTDKVNINK LITASTNVAV 
551 KNFNINELIV KTNGISVGEY THFSEDIGSQ SRINTVRLET GTRSLFS6GV 
501 KFKGGEKLVI DCPYYSPWNY FDARNIKNVC ITNKLAFGPQ GSPW6TSKLH 
651 FNNLTI fiONA VMmOFI Nl TIQGDFINNO 6TINYLVRGG KVATLSVG^fA• 
701 AAMMFNNDID SATGFYKPLI KINSAQDLIK NTEHVLLKAK IIGYGNVSTG 
751 TNGISNVNLC EOFKERLALY NNNNRHDTCV VRNTDDIKAC .GMAIGDQSMV 
801 NNPDNYKYLI GKAWKNIGIS KTANGSKISV YYLGNSTPTE NGGNHNLPT 
851 NHSNARSAN NALAQNAPFA QPSATPNLVA INQHDF6TIE SVFELANRSK 
901 DIDTLYANS6 AQGRDLLQTL LIDSHDA6YA RKHIDATSAN EITKQLNTAT 
95] TTINNIASLE HKTSGLOTLS LSNAMILNSR LVNLSRRHTN HIDSFAKRLQ 
1001 ALKDQKFASL ESAAEVLYQF APKYEKPTNV WANAIGGTSL NNGSNASLYG 
1051 TSAGVDAYLN GQVEAIVGGF GSYGYSSFNN RANSLNSGAN NTNFGVYSRI 
1101 LTNQHEFDFE A0GAL6SDQS SLNFKSALLQ DLNQSYHYLA YSAATRASYG 
1151 YDFAFFRNAL VLKPSVGVSY NHLGSTNFKS NSTNQVALKN GSSSQHLFNA 
1201 SANVEARYYY GDTSYPYMNA GVLQCFAHVG SNNAASLNTF KVNAARNPU 
1251 THARVMMGGE LKLAKEVFLN LfiWYLHNI T SNIfiHFASNL 6MRYSF 

FIG. 2
CTCCATmAAGCAACTCCATA6ACCACTAAA6AAACTTTTTnGA6GCTArCTTTGAAA
GCTTAATTATACATGCTATAGTAAGCATGACACACAAACCAAACTATnTTAGAACCCn 
TCAAAAA GAnCATncnATrTCnGnCTTAnAAAGnCTTTCATnTAGCAAATTT 
CrmTTCAATATTAATAATGAnAATGAAAAAAAAAAAAAATGCTTGATATTGnGTAT 
nGACACTAACAA6ATACC6ATA6GTATGAAACTA6GTATAGTA4fiMfiAAACAAT6ACT 

M T 

AATAATCTTCAAGTAGCTTncnAAAGnGATAACGCTGTCGCTTCATACGATCCTGAT 
23NNLQVAFLKVDNAVASYDPD 

CAAnAAGGGAAGAATACTCCAATAAAGCGATCAAAAATCCTACCAAAAAGAATCAGTAT 
63QLREEYSNKA1KNPTKKN0Y 

GAATGTTCCACAAAGAGCTnCAGAAATTTGGGGATCflGCGnACnRAATTTTCACAAGT 
103ESSTKSFQKFGDQRYRI FTS 

GAAAATATCATACAACCCCCTATCCTT6ATGATAAA6AGAAA6CS6AQTTTTT6AAATCT 
143ENIIQPPILDDKEKAEFLKS 

ATGGGCGTGTTTGATGAGTCCTTGAAAGAAAGGCAAGAAGCAGAAAAAAATGGAGAGCCT 
183MGVFDESLKERQE AEKNGEP 

■ GATGTCAAAGAAGCAATCAATCAAGAACCACnCCCCATGTCCAACCAGATATAGCCACT 
223 D V K EAINQEPVPHVQPD IAT 

AATTnTCTAAATTCACTCTT66CGATATQGAAAT6nAGAT5TTGA6GGAGTCGCTGAr. 
263 N F v; K h I L G D H E M L D V E 6' V A I) 

.nAATGGGGACTCATAATGGCATAGAACCTGAAAAAGTTTCATTGTTGTATGGGQGCAAT 
303 L M 6 S H N G I E P E K V S L I. Y 6 G N 

AACAATGTGGCTACAATAATTAATGTGCATATGAAAAACGGCAGTGGCnAGTCATAGCA 
3A3.N NVATI INVHMKNGS6LVIA 

GGCTCACAACGAGCATTAAGTCAAGAAGAGATCCAAAACAAAATAGATTTCA TGGAATTT 
383 G S Q R A L S Q E E I 0 N K I D F H E F 

ACTGAGAnAAAGAnTCCAAAAAGACTCTAAGGCTTATTTAGACGCCCTAGGGAATGAT 
^23 T E I K 0 F Q K U S K A Y L D A L G N D 

AATGGGGATTTGAGCTACACTCTCAAAGATTATGGGAAAAAAGCAGATAAAGCnTAGAT 
'463NGDLSYTLKDYGKKAnKALD 

TATTCTAATTTCAAAIACftCCAACGCCTCCAAGAATCCCAATAAGGGTGTAGGCGTTACG 



FIG. 4A
ATCTG I CCl ATTGATTI ti 1 1 TTCCATTTTGTTTCCCATGTGGATCTTGTGGATCflCAAAC 120
CATCTGCTCACCnGACTAACCATTTCTCCAACCATACTTTAGCGnGCATTTGAnTCT 2*10 
TTGTTAATTGTGGGTAAAAATGTGAATC6TCCTAGCCTrTA6ACSCCT6CAACGAtCS66 360 
AATGAGAATGTTCAAAGACATGAATTGACTACTCAAGC6TGTAGCGAI I n i AGCAGTCI 480 
"AACGAAACCATTGACCAACAACCACAAACCiSAAGCGGCTnTAACCCGCAGCAATTTATC 600 
NETIDQQPOTEAAFNPOOFI 
CAAAAACCAATCGnGATAAGAACGATAGGGATAACAGGCAAGCmTGAAGGAATCTCG 720 
QKPIVDKNDRDNRQAFEGIS 
TTTTCAGACrTIATtAATAAGAGCAATGATTTAATCAACAAAGACAATCTCATTGATBTA 8^0 
FSDFINKSNDLINKDNLIDV 
TGGGTGTCCCATCAAAACGATCCGTCTAAAATCAACACCCGATGGATCCeAAATTTTATG 960 
WVSHQNDPSKINTRSIRNFM 
GCCAAACAATCTTTTGCAGGAATCAnATAGGGAATCAAATCCGAACGGATCAAAAGTTC 1080 
AKOSFAGIIIGNOIRTDQ KF 
ACTGGTGGGGATTGGnGGATATTrrTCTCTCATTTATATTTGACAAAAAACAATCTTCT 1200 
TGQDHLDIFLSFIFDKKQSS 
ACCACCACCGACATACAA66CTTACC6CCTfiAAGCTA6A6ATTTACTTGATGAAA6B66T 1320 
TTTDIQGLPPEARDLLDtRb 
ATTGATCCCAAnACAAGnCAATCAATTATTGATTCACAATAACCCTCTGTCTTCTGTC IW 
lUPNYKFNQLLIHNNALSSV 
GGTGGTCCTGGAGCTAGGCATGAnGGAACGCCACCGTTGGTTATAAAGACCAACAAGGC 1560 
GGPGARHDWNATVGYKDQQG 
GGTGGIGAGAAAGGGAI UACAACCCTAGTTTTTATCTCTACAAAGAAGACCAACTCACA 1680 
GGCKGINNPSFYLYKEDQLT 
CTTGCACAAAATAATGCTAAAnAGACAACTTSAGCGAGAAAGAGAAGGAAAAATTCCGA 1800 
LAONNAKLDNL!>tKEKEKFR 
CGTAnGCTnTGTnCTAAAAAAGACACAAAACAnCAGCTTTAAnACTGAGTTTGGT 1920 
RIAFVSKIcnTKHSALITEFG 
AGGGAGAAAAATGTTACTCnCAAGGTAGCCTAAAACATGATGGCGTGATGTTTGTTGAT 20U0 
REKNVTLQGSLKHDGVMFVD 
AATGGCGmCCCATTTAGAAGTARfir.TTTAACAAGGTAGCTATCTTTAATTTGCCTGAT 2160 
503 YSNFKYTNASKNPNK6V-C VT

TTAAATAATCTC6CTATCACTAGnTCGTAA66C66AATTTA6AeeATAAACTAACCACT 
5il3LN NLAITSFVRRNL::DKLTT 

GAAnGCnGCAAAAACTTTAAACnCAATAAACCTGTACnGACGCTAAAAACACAGGC 
583 ELVGKTLNFNKAVADfl KNTG 

CATTTAGAGAAAGAAGTAGAGAAAAAAnGGAGAGCAAAAGCGGCAACAAAAATAAAATG 
623 HLEKEVEKKLESKSGNKNK M 

GCTAATAGAGflCGCAAGAGCAATCGCTTACGCTCAGAATCTTAflAGGCATCAAAAGGGAA 
663ANRDARAIAYAQNLKGI KRE 

GAAnCAAAAAT6GCAAAAATAAG6ATTTCA6CAA 6GCAGAAGAAACACTAAAA6CCCTT 
705 IE F K N G K N K D F ^ K| A E E T L K A L 

AATGCAGCTnGAA TGAATTCAAAAATGCCAAAAATAAGGATTTCAGCAA GGTAACGCAA 
713 N A A L N lEFKNGKNKDFS Kl V T Q 

AAAGTTGATAATCTCAATCAAGCGGTATCAGTGGCTAAAGCAACGGGTGATrrCAGTAGG 
783 KVDNLNQAVSVAKAT6-DFSR 

CAAAAAAATGAAAGTCTCAATGCTAGAAAAAAATCTGA^ATATATCAATCCGTTAAGAAT 
823 QKNESLNARKKSEIYQ.SVKN 

AAAAACTTTTCGGACATCAAGAAAGAGnGAATGCAAAACnGGAAATTT CAATAACAA T 
863KNFSD I KKELNA KL GNF |N N N 

CAAGCAGCTAGCCnGA AGAACCCATTTACGCT CAAGnGCTAAAAAGCTAAATCCAAAA 
903 Q A A S L E IE P I Y 71 QVAKKVN AK" 

CCTTTGAAAAGGCATGATAAAGTTGATGATCTCAGTAAGQT AtiGGn TTCAAGGAATCAft 
9J|3 PLKRHDKVDDLSKV| G L S R N 0 

TTTGGCAATCTAGAGCAAACGATAGACAAGCTCAAAGAnCTACAAAACACAATr.r.r.ATri 
983 FGNLE0TI DKLKD STKI+NPM 

TACCCTACTAACAGCCACATACGCAnAATAGCAATATCAAAAATGGAGCAATCAATGAA 
NGVSHLEVGFNKVAIFNLPD
AAAGGAnGTCCCCACAAGAAGCTAATAAGCnATCAAAGAI I J 1 1 IGAGCAGCAACAAA 228U 
KGLSPQEANKLIKDFLSSNK 
AATTATGATGAAGTGAAAAAAGCTCAGAAAGATCTTGAAAAATCTCTAAGGAAACGAGAG 2^*00 
NYDEVKKAQKDLEKSLRKKE 
GAAGCAAAAGCTCAAGCTAACAGCCAAAAAGATGAGATTTTTGCGTTGATCAATAAAGAG 2520 
EAKAOANSQKDEIFALINKE 
TTGTCTGATAAACTT6AAAATGTCAACAA6AATTT6AAAGACTTT5ATAAATCTTTTGAT 26A0 
LSDKLENVNKNLKDFDKSFD 
AAAG6TTCGfiTfiAAAfiAmAGRTATCAATrX.AriAATGGATTTCAAAAGTTGAAAACCTT 2760 
KGSVKDLGINPEWISKV ENL 
GCAAAAAGCGACCTTGAAAATTCCGTTAAA6ATGTGATCATCAATCAAAAGGTAAC6GAT 2880 
AKSDLENSVKDVIINQKVTD 
GTAGAGCAAGCGTTAGCCGATCTCAAAAATTTCTCAAAGGAGCAAnGGCCCAACAAGCT 3000 
VEQALADLKNFSKEQL AQQA 
GGTGTGAATGGAACCCTAGTC6GTAATG6GTTATCTCAAGCAGA:;\GCCACAACTCTTTCT 3120 
GVNGTLVGNGLSQAEATT'lS 
AACAATAA TGGACTCAAAAA CGAACCCATTTATGC TAAAGTTAATAAAAAGAAAGCAGGG 5240 
N N Nl G L K N |E P I Y A) K V N K K K A 6 
ATTGACCGACTGAATCAAATAGCAAGTGGrrTGGGTGTTGTAGGGCAAGGAGCGGGCTTC 3360 
IDRLNQIASGLGVVGQAAGfr^ 
6AATT6GC1CAGAAAATTGACAATCTCAATCAAGCGGTATCAGAAGCTAAAGCAGGTTTT 3^*80 
ELAQKIDNLNQAVSEAKA6F 
AATCTATGGGTTGAAAGTGCAAAAAAAGTACCT6CTA6TTTGTCAGCGAAACTA6ACAA.T 3600 
NLWVESAKKVPASLSAKLDN 
AAAGCGACCGGCATGCTAACGCAAAAAAACCCTGAGT6GCTCAAGCTCGT6AAT6ATAA6 3720 



FIG. 4D






1023 YATI^SHI RINSNIKNGAIN 
ATAGTTGCGCATAATGTAGGAAGCGTTCCTTTGTCAGAGTATGATAAAAnGGCnC 

1063 IVAHNVGSVPLSEYDKIGF 
GTAAAAGACACTAAnCIGGGmACriCAATTmAACCAATGCATTTTCTAr^ 

1103 VKDTNSGFTQFLTNAFSTA 
GGmCCAAAAATCnAAAGGATTAAGGAATACCAAAAACGCAAAAACCMCCCTTG 

1143 G F Q K S 

TGAATGCTACCAAnCATGGTATCATATCCCCATACAnCGTATCTAGCGTAGGAAG 
AACTCTGTAAAATCCCTAnATAGGGACACAGAGTGAGAACCAAACTCTCCCTACGG 
GACAGACACTAACGAAAGGCnTGTTCTTTAAAGTCTGCATGGATATTTCCTACCCC 
CGAAAAnAAnAAGGGTTATAAAGACAGCATAAACTAGAAAAAAGAAGTAGCTATA 
GAAAAATCAGAAAAACCATAGGAATTATCACACCTTATAATGCCCAAAAAAGACGCT 
ATGCCTTTCAAGGTGAAGAGGCAGATAnATrATTTAnCCACCGTGAAAACTTGTG 
ATCTCATTnTGTGGGTAAAAAGTCTrrCTTTGAGAATTTATGAAGCGATGAGAA&A 
CATTCTTCGCTTCAAAACGCTTTCATAAATCTCTCTAAAGCGCTTTATAATCAACAC 
mnAGCGnACAAmGAGCCAnCmAGCnGTTTTTCTAGC€AGATCACATC 
CTGCAAATATCCTACAATAGCATC6CCC6AAT66AT6AGTA6G66666TGTTGAAAG 
TAAAAfAA I CAC nCGGGAAAATCTTTAAGGGAGTGAAATAATAACGCATGCAAGTT 
TGCCAAACATTCAAATAGCCrTGnGTnCAGGGCATTGTCATAAGCGTTGGATTGG 
GCTAAAATGCmGCTCAATCACGCCCACAATAGGGATTTTGGAATGCrnrTGCATC 
TTGAAAAAATCCAAAGCCTCTAAGCCAAAnGCnGATCGTAGTGGGGTCTTTAGTG 
A66CTTTTTAAAAC6CTAAACCCTCCCACACCGCTATCAAAAACGCCTATTTTCATG 
TCTTCATTGTCCTTAGTTTGTTGCATTTTAGAATAGACAAAGCTT 5925 



FIG. 4E 






EKATGHLTQKNPEWI ICLVNJJK 

AACCAGAAGAATATGAAAGATTAnCTGATTCSnCAAGnrrCCACCAAGnGAACAATGCT 3840 
NOKNHKDYSDSFKFSTKLNNA 

TCTTATTACT6CTTG6CGAGAGAAAAT6CGGAGCATGGAATCAA6AACGTTAATACAAAAG6T 3960 

Y Y C L A R ENAEHGIKNVNTKG 

nAAAAGCGMGGGITTmAATACTCCn^ 4080 

TGrGCAAAGTTACGCCTTTGGAGATATGATGTGTGAGACCTGTAGGGAATGCGnGGAeCTCA ^200 

GCAACATCAGCCTAGGAAGCCCAATCGTCTTTAGC66TTGGGCACTTCACCTTAAAATATCCC 4320 

AAAAAGACnAACCCTTTGCTTAAAAnAAGTTTGATTGTGCTAGTGGGTTCGTGCTATAGTG ^I^I^O 

ACAAAGATCAAGnCAAAAAATCATAGAGCTTnAGAGCAAATTGATCGCGercnAACCAAA 4560 

TGCGATCAGAAGT6GAAAAATACG6CTTCAA6AATTnGATGAGCTCAAAATAGA.CACTGTG& 4680 

GTAATCmCTTTCTTGCTAGATTCTAAACGCTTGAATGTGGCTATTTCTAGGGCAAAAGAAA 4.800 

ATATCmAGCGCTATmGCAAGTCTGTAGATAGGTAATCTTTTCCAAAGATAATCATTAGA 4920 

AATACCCnATAGTGTGAGCTATAGCCCCTTTnGG6AATTGAGTTJ\Tm6ACmAAATTT 5040 

GCCGCTrGCATGAAAnCCACmAGGGAATGCGTGTGCATTTTTmAAGGGCGTATTm 5160 

GGCAAAATGCTCCATAAAATAGCCCTCAATTTTTTGAGCGATTAA666AAAATGCGTGCAACC 5280 

TCTAACAATTCGCCCTCTAAAATACTnCTTCAATr.AAAGGCACAAAAAGAGAAGlGGCTAAA 5400 

■ ATCGTCGCmTGTCCCTAGCACTAAAATAGGGGCGTTrrTATCTTTTACnGTCGCTTGATC 5520 

TCnCTAAAGCTAGACCGCTCGCTGTGnGCATGCCACAATCAATAATTCAATCTGGTGCGGT 5640 

CCA TAAGGCA CTCTAGCCGTATCGCCATAATA6AT6AT I I CA TCAAATAATTGCGCTTTTAAA 5760 

ACALI 1 ! 1 1 i AAmAATGGGAnAAnAGGGATmATTmCATTCATTAAGTTTAAAAAT 5880 
10 30 50

AAGCnGCTGTCATGATCACAAAAAACACTAAAAAACATTATTAnAMGATACAAAATG 

. M 

70 90 110 ; 

GCAAAAGAAATCAAATmCAGATAGTGCGAGAAACCnnATTTGAAGGCGTGAGGCAA 
AKEIKFSDSAR NL L-FEGVRQ 

130 150'" " 170 

CTCCATGACGCTGTCAAAGTAACCATGGfiGCCAAGAGGCAGGAATGTATTGATCCAAAAA 
LHDAVKVTHGPRGRNVLIQK 

190 210 230 

AGCTATGGCGCTCeAAGCATCACCAAAGACGGCGTGAGCGTGGCTAAAfiAGATTGAATTA 
SYGAPSITKDGVSVAKEIEL 

250 270 290 

AGnGCCCAGTAGCTAACATGGGCGCTCAACTCGnAAAGAAGTAGCGAGCAAAACCGCT 
SCPVANMGAOLVKEVASKTA 

310 330 350 

GATGCTGCC6GCGATGGCAC6ACCACAGC6ACCGT6CTA6CTTATAGCATTTTTAAAGAA 
DAAGDGTTTATVLAYSIF KE. 

570 390 ^10 

GGTTTGAGGAATATCACGGCTGGGGCTAACCCTAnGAAGTGAAACGAGGCATGGATAAA 
6LRNITA6ANPIEV KR6-MDK 

^5Q 450 470 

GCTGCTGAAGCGATCAnAATGAGCnAAAAAAGCGAGCAAAAAAGTAGGCGGTAAAGAA 
AAEA IINELKKASKKVGGKE 

490 510 530 

GAAATCACCCAAGTGGCGACCATTTCTGCAAACTCCGATCACAATATCGGGAAACTCATC 
EITQVATISANSDHNIGKLI 

550 570 590 

GCTGACGCTATGGAAAAAGTGGGTAAAGACGGCGTGATCACCGTTGAGGAAGCTAAG6GC 
ADAMEKVGKD6VITVEEAKG 

610 630 , 650 

ATTGAA6ATGAATT6GATGTC6TAGAA6GCATGCAATTTGATA6AGGCTACCTCTCCCCT 
IFHE LDVVFGMQFDRGYLSP 
570 690 /lO

TATTnGTAACGAACGCTGAGAAAATGACCGCTCAAnGGATAATGCnACATCCTTTTA 
YF VtNAEKMTAQLDW.AYILL 

730 750 770 

ACGGATAAAAAAATCTCTAGCATGAAAGACAnCTCCCGCTACTAGAAAAAACCATGAAA 
TDKKISSMK DILPLLEKTnK 

790 810 Hindlll 

GAGGGCAAACCGCTTTTAATCATCGCTGAAGACATTGAGGGC GAAGCTT 'fAAL-b'ACTCTA 
E6KPLLIIAED1EGEALTTL 

850 870 890 

GTGGTGAATAAAnAAGAGGCGTGTTGAATATCGCAGCGGTTAAAGCTCCAGGCTTTGGG 
VVNKL RGVLNIAAVKAP6F G 

910 • 930 950 

GACAGAAGAAAAGAAATGCTCAAAGACATCGCTATTTTAACCGGCG6TCAAGTCATTAGC 
DRRKFMLKDIAILT6 GQVIS 

970 990 1010 

GAAGAATTGGGCnGAGTCTAGAAAACGCTGAAGTGGAGTTTTTAGGCAAAGCTGGAAGG 
EELGLSLENAEVEFLGK AGR 

1030 1050 ' 1U7U 

ATTGTGAnGACAAAGACAACACCACGATCGTAGATGGCAAAGGCCATAdCGATGATGn 
IVIDKDNTTIV n-GK GHSDDV 

1030 1110 1130 

AAAGACA6AGTCGCGCA6ATCAAAACCCAAATT6CAAGTACGACAAGCGATTATGACAAA 
KDKVAUIKTOIASTT.SDY DK 

1150 1170 1190 

GAAAAATT6CAAGAAAGAn66CTAAACTCTCTGGCG6T6T66CTGT6AnAAAGT666G 
EKLQERLAKLSGGVAVIKVG 

1210 1230 1250 

GCT6CGAGTGAAGTGGAAATGAAAGA6AAAAAAGACC666T66AT6ACGC6TTGA6CGCG 
AASEVEM. KEKKDHVDDALSA 

1270 1290 1310 

flCTAAArirXiGr.riGTTGAAGAAGGCATTGTGAnGGTGGCGGTGCGGCTCTCATTCGCGCG 
TKAAVEEGIVIGGGAALIRA 
1330 1350 1370 ;

6CTCAAAAAGTGCATTT6AATTTGCACGAT6ATGAAAAAGTGGGCTATGAAATCATCATG 
AQKVHLNLHDDEKVGYEI IM 

1390 1110 IWO 

CGCGCCATTAAAGCCCCATTAGCTCAAATCGCTATCAACGCTGGmTGATGGCGGTGTG 
RA IKAPLAQIAINAGYDG6 V 

1150 1170 1190 

GTCGTGAATGAAGTAGAAAAACACGAAGGGCATTTTGGTnTAACGCTAGCAATGGCAAG 
VVNEVEKHE6HFGFNASN GK 

1510 1530 1550 

TATGTGGATATGTrTAAAGAAGGCATTATTGACCCCTTAAAACTAGAAAGGATCGaCTA 
YVDMFKE6IIDPLKVE RIAI. 

1570 1590 ' IblO ' 

CAAAATGCGGTrTCGCrrTCAAGCCTGCTTTTAACCACAGAAGCCACCGTGCATGAAATC 
QNAVSVSSLLLTTEATVHEI 

1630 1650 1670 ' 

AAACAAGAAAAAGCGACTCCGCCAATGCCTGATATGGGTG6CATGGGC6GTAT6G6A6GC 
KEEKATPAnPDflGGM.GG'MGG 

1690 1710 1730 

ATGGGCG6CATGAT6TAA6CCCGCTTGCTTTTTA6TATAATCT6CTTTTAAAATCCCTTC 
M G G M M * 

1750 1770 1790 

TCTAAATCCCCCCCmCTAAAATCTCTTTTTTGGGGGGGTGCTTTGATAAAACCGCTCG 

1810 1830 
CTTGTAAAAACATGCAACAAAAAATCTCTGnAAGCTT 